Abstract

A mathematical formulation of the Rigid Motion Perception problem is
described. The constraints on the parameters of rigid motion (i.e.,
three-dimensional velocities) obtained from image motion data
(two-dimensional projected velocities) are analyzed. A brief survey of related
work shows the lacunae in the existing body of research in this area.
Uniqueness results and computational algorithms are presented to
compute the rigid motion parameters from retinal velocities. The
approximations involved in the velocity representation are stated.
Algorithms and constraints to permit cooperative computation of
motion and shape are described.


1.  Introduction 


Motion is a ubiquitous phenomenon in our everyday life. It is therefore
important, in the study of Computer Vision, to understand the retinal motion flux
arising both from movement of the observer and from the motion of
environmental objects. The study of the motion of rigid objects (or surfaces), in
particular, is a relevant avenue for investigating motion perception. In general,
computing three dimensional motion from monocular two dimensional image
motion flux is an underdetermined problem, admitting an infinite number of
solutions. The assumption of rigidity makes the problem tractable (see Ullman's
paper [33] for a discussion of nonrigid motion perception). Furthermore, most of
the moving objects in our environment are rigid. From a practical standpoint, the
study of rigid body motion is interesting, since it finds widespread application in
the areas of optical navigation, tracking and recovery of the 3D structure of rigid
objects.

The motion of a body can be characterized by the rate of change of the
positions of various points on its visible surface. Thus, at least instantaneously, this
corresponds to a three dimensional velocity field. If the body (or surface) is rigid,
then this velocity field can be described by a vector function of the three
dimensional position coordinates and six global parameters (see figure I), which
are:

(i) The three components of the velocity of any point O on the body. These are
called the translation parameters.

(ii) The rotational velocity components of a coordinate frame, with origin O,
attached rigidly to the body.

It is a standard result from kinematics and geometry (see [7]) that although the
rotational parameters are invariant with respect to the choice of the origin O of
the body frame, the translation parameters are dependent on the choice of O.
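The dependence of the translation parameters on the choice of O can be illustrated with a small sketch. It assumes the usual rigid body velocity relation (velocity of a body point P equals the velocity of O plus the rotational velocity crossed with P - O); the function names and all numerical values are hypothetical and serve only as an illustration.

```python
def cross(a, b):
    """Vector cross product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(p + q for p, q in zip(a, b))

def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))

def rigid_velocity(T, omega, O, P):
    """Velocity of body point P, given the velocity T of body point O
    and the rotational velocity omega of the body frame."""
    return add(T, cross(omega, sub(P, O)))

# Hypothetical numbers, chosen only for illustration.
omega = (0.1, -0.2, 0.3)
O1, T1 = (0.0, 0.0, 5.0), (1.0, 0.0, 0.5)

# Describe the same motion about a different body point O2:
# the rotation is unchanged, the translation becomes the velocity of O2.
O2 = (1.0, 2.0, 6.0)
T2 = rigid_velocity(T1, omega, O1, O2)

P = (0.5, -1.0, 4.0)
v1 = rigid_velocity(T1, omega, O1, P)
v2 = rigid_velocity(T2, omega, O2, P)
```

Both descriptions assign the same velocity to every body point, although `T1` and `T2` differ while `omega` is shared.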

When considering motion of rigid bodies, there are two cases of interest,
namely, Egomotion and General Motion. Egomotion or self-motion refers to the
movement of the camera or sensor in a static environment. The image flux, or
optical flow, generated by such a motion is due to a single relative movement,
i.e. between observer and static environment. In contrast, General Motion implies
that there is more than one object moving with different velocities in the observer's
field of view. In this case the optical flow field consists of many segments
corresponding to the various moving surfaces. Each segment is characterized by
the translational and rotational velocities of the associated moving rigid surface
inducing the optical flow. These velocities are called the parameters of motion for
the rigid surface.

The rigid motion parameters are usually expressed with respect to a frame of
reference attached to the moving surface, which is assumed to coincide with the
observer's frame of reference at the time of observation. The problem is to
determine the motion parameters corresponding to an optical flow field segment. If
the depth of the scene is unknown then it can be shown that only the rotation -
which is depth invariant - can be determined uniquely, whereas the three
translation parameters can only be determined up to a scale factor (this is the depth
scaling effect). Thus we can determine five parameters to characterize the motion
in this case.

Motion in three dimensions causes the pattern of light falling on the retina
(or any two dimensional array of photo-sensitive elements) to vary in time in
accordance with the motion. Hence, the input (or stimulus) to any computational
process endeavouring to understand the motion is the two dimensional projection
of the three dimensional motion. Since a velocity field is a good representation for
the three dimensional motion, it is customary to choose a two dimensional velocity
field representation for the image or retinal motion. The latter is called optical
flow.

The  problem  addressed  in  this  paper  concerns  the  computation  of  the 
parameters  of  rigid  motion  and  the  structure  of  the  moving  surface  from  retinal 
stimulus  such  as  optical  flow. 

The optical flow field is a principal source of information about the motion
inducing the "flow", as well as the 3D structure of the moving surface being
observed. The optical flow comprises two parts, corresponding to the rotation and
the translation, respectively, of the inducing motion. The optical flow due to rigid
motion is constrained at every point by the parameters of the motion. However,
since the parameter space has a large dimension and the constraint is nonlinear in
form, computation of the motion from optical flow (or image displacements) by
search techniques is computation intensive.


The optical flow field is mathematically separable into a translational part and
a rotational component. It has long been recognized [12] that motion
perception becomes simpler in the instances when the optical flow field can be
computationally separated into the translational and rotational parts. A familiar
illustration of this is the case of motion parallax observable at depth discontinuities
in the retinal field. The effect is to reduce the dimensionality of the space of
unknowns. Unfortunately this seems to be very hard to accomplish, in general.
Motion parallax is the basis for an algorithm by Lawton [24]. Other approaches to
the problem can be found in [m 19], involving nonlinear least square techniques or
using local constraints involving derivatives of the optical flow.

As stated previously, algorithms for rigid motion perception are difficult to
design due to two main reasons:

(1) The space of parameters is of high dimensionality (e.g. five).

(2) The constraint equations obtained by optical flow measurements are
nonlinear.

There have been some clever implementations of nonlinear search algorithms
to interpret 3D motion from optical flow data [22,23]. There have also been
discrete point tracking algorithms by Tsai and Huang [30], Fang and Huang
[8,9] and Longuet-Higgins [20]. In some of the latter algorithms, the nonlinear
motion equations are linearized in terms of synthetic parameters, which are
nonlinear combinations of the actual motion parameters. Tsai and Huang, and
Fang and Huang, note the cases when such algorithms fail to compute motion
parameters.


In  this  paper  we  examine  the  situations  when  the  optical  flow  field  is  capable 
of  being  interpreted  in  more  than  one  way.  An  instance  of  such  ambiguity  is  the 
optical flow field due to motion of a plane [29].

A geometric analysis of the problem of computing 3D motion parameters
from 2D image velocities has been done by Longuet-Higgins and Prazdny [19].
The constraint equations that they derive are simple in form, but deal with
velocities. To implement a motion analysis algorithm based on these equations,
one makes the assumption that the temporal grain of the observations is fine
enough to talk meaningfully about the velocities or time derivatives of both the
image and world positions. Representing motion by velocity parameters entails
making a first order approximation of the temporal behaviour associated with the
motion. Thus, for example, if the displacement of a particle moving in one
dimensional space is $\Delta x$ in time $\Delta t$, then $\Delta x / \Delta t$ is a good
approximation for the velocity only when $\Delta t$ is small enough that the change
in velocity in this time period is small.
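The first order approximation can be made concrete with a short numerical sketch. It assumes a hypothetical one dimensional particle with constant acceleration (the values of `v0` and `a` are arbitrary) and compares the ratio of displacement to elapsed time with the true instantaneous velocity.

```python
def position(t, v0=2.0, a=3.0):
    """Position of a hypothetical particle with initial velocity v0
    and constant acceleration a."""
    return v0 * t + 0.5 * a * t * t

def true_velocity(t, v0=2.0, a=3.0):
    """Exact instantaneous velocity of the same particle."""
    return v0 + a * t

def finite_difference_velocity(t, dt):
    """Displacement over the interval [t, t + dt] divided by dt."""
    return (position(t + dt) - position(t)) / dt

# Error of the velocity approximation for a fine and a coarse temporal grain.
fine_error = abs(finite_difference_velocity(1.0, 0.01) - true_velocity(1.0))
coarse_error = abs(finite_difference_velocity(1.0, 1.0) - true_velocity(1.0))
```

For this particle the error grows in proportion to the interval, since the velocity changes by `a * dt` over it; the approximation is therefore only as good as the temporal sampling is fine.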

An alternative derivation is due to Tsai and Huang [30]. Their approach is to
analyze the relation between the projected displacement vectors in the image plane
due to an arbitrary rigid displacement of a set of points in 3D. It is known [7] that
this type of motion can be characterized by a rotation about an axis passing
through the origin of the reference coordinate frame and a translation.
The treatment in this paper assumes the velocity representation for rigid
motion. The assumptions underlying the work reported here are:

(i) The motion being observed is due to a rigid surface.

(ii)  The  time  constant  (or  sampling  interval)  of  the  sensor  is  small  enough  to 
make  a  first  order  approximation  of  the  temporal  behaviour  due  to  the 
motion  being  observed. 

1.1.  Review  of  previous  work 

The computation of rigid motion parameters from image displacement vector
fields has been studied by a number of researchers. Egomotion has been
considered in the literature by Longuet-Higgins and Prazdny [19], Prazdny [22],
Waxman and Ullman [34] and Bruss and Horn [6]. Longuet-Higgins and Prazdny
examine ways of determining 3D structure and motion parameters from optical
flow. Their method depends upon accurate reconstruction of the optical flow field.
An interesting result due to them is that for non planar surfaces local analysis of
the flow field yields a cubic constraint involving the motion parameters. Prazdny
[22] has devised a five point algorithm to solve for the motion parameters from
nonlinear constraint equations. Waxman and Ullman's method depends upon
reconstruction of the optical flow field analytically, in local neighbourhoods. Bruss
and Horn propose a least square solution to the parameter estimation problem.

Some other computational approaches attempt to segment the optical flow
field into translatory and rotatory components, albeit approximately. An example
is the method of Rieger and Lawton [24] where the change of rotational flow at
steep depth gradients is treated as noise. Jain [17,18] computes the focus of
expansion before computing the image displacements and uses the former to guide
the correspondence for finding the latter.

All the above analyses pertain to the computation of motion parameters from
optical flow, i.e. continuous or differential image motion. An alternative approach
is to consider evaluating the motion parameters and 3D structure from discrete
point correspondence. Ullman [32] shows that three views of four non coplanar
points are adequate to determine the structure and motion of these points under
orthography. Tsai and Huang [30] prove that the motion of seven points not lying
on two planes, one of which passes through the origin, nor on a cone passing
through the origin, can be uniquely computed from discrete displacements. Fang
and Huang [8,9] prove that the structure and motion of nine points not lying on a
second order surface passing through the origin is uniquely determined from
image displacements. Nagel and Neumann [21] and Roach and Aggarwal [25] have
also looked at the problem of determining motion from discrete displacements.

Yet another approach to the problem of motion parameter computation has
been to restrict the motion to simplify the analysis. Webb and Aggarwal [35],
Hoffman and Flinchbaugh [14] and Hoffman and Bennett [15] analyze rigid
motion with the additional assumption of a fixed axis of rotation or planarity. A
major motivation for this type of analysis is that it models the locomotion of man
and animals.
1.2. Summary of Results reported here

It  is  evident  from  the  review  of  the  existing  body  of  work  in  the  field  of 
motion  perception  that,  although  considerable  work  has  been  done,  much  remains 
undone.  Uniqueness  proofs  of  the  type  derived  by  Tsai  and  Huang  and  Fang  and 
Huang  do  not  allow  us  to  visualize  the  situations  when  the  optical  flow  field  is 
intrinsically  ambiguous,  admitting  more  than  one  interpretation.  An  analysis  of 
the  optical  flow  field  to  determine  cases  of  ambiguity  will  be  a  major  focus  of  this 
paper. 

When the image formation geometry is modeled by means of the parallel
projection model, the constraint equations become simplified. This is also called
the Orthographic Projection model of image formation (see figure IIb). The attendant
simplicity in the motion equations can be used to considerable advantage in the
preliminary analysis of the motion perception problem. The following results are
derived:

1. The component of rotation about the line of sight, the ratio of the other two
components of rotational velocity, and the tilt function are uniquely
computable from a single optical flow field, for a rigid non planar surface.

2.  When  the  surface  normals  for  a  rigid  surface  are  known  then  the  motion 
parameters  can  be  computed  uniquely. 


The Perspective Projection model (see figure IIa) is a more accurate model of
image formation by eye or camera. For this model it is proved that:


1. The optical flow field, under the assumptions of rigidity, can have at most
three interpretations.

2. The rigid motion of any surface whose depth from the nodal point of the
sensor cannot be expressed by the rational function $P_1(x,y)/Q_2(x,y)$, where
$P_1$ and $Q_2$ are rational functions of the first and second orders
respectively, is uniquely computable from the information in the optical flow
field.

3. Two optical flow fields, obtained at different time instants, determine the
motion parameters uniquely.

4. The motion parameters are uniquely determined from the optical flow field
when the corresponding motion involves rotation only.

5. The optical flow due to planar surfaces is generally ambiguous. However, this
ambiguity can be resolved either when the flow field is due to more than one
plane moving together rigidly, or, in the case of a single plane, if its tilt is
known.

7.  It  is  feasible  to  design  a  cooperative  algorithm  for  computing  both  shape  (e.g. 
surface  normals)  and  optical  flow,  under  conditions  of  rigid  motion. 

2.  The  Geometry  of  Rigid  Motion 

Consider a sensor moving relative to a static scene. The co-ordinate frame
(X, Y, Z) is fixed to the sensor (see figure I). The viewing direction is along the
positive Z-axis.




A rigid body is defined as a set of points whose relative Euclidean distances
from all other points in the set are invariant with respect to the transformations of
rotation and translation. In addition, since we will generally deal with opaque
objects, we will observe only points on a surface (or a manifold) in 3-space. In
other words, the 3 Cartesian coordinates of a point on a rigid body are not
independent. Formally,

$$B = (\psi, f)$$

where

$$\psi = \{\,(X, Y, Z) \mid \text{point on the surface of } B\,\}$$

$$f(X, Y, Z) = 0$$

When the body B is displaced with respect to the frame of reference, we
obtain a new representation

$$B' = (\psi', f')$$

The displacement is described by the affine transformation

$$\mathbf{X}' = R\,\mathbf{X} + T \qquad (2.1)$$

where $\mathbf{X} = (X, Y, Z)^T$ is the position vector of a point of the body.
Any displacement of a rigid body can be modelled by the above equation, which
describes a rotation about an axis through the origin and a translation specified by
the vector T.

If the rotation angle is small, it can be decomposed into three component
rotations about the individual axes separately [16]. In this case R and T are given
by

$$R = \begin{bmatrix} 1 & -\omega_z & \omega_y \\ \omega_z & 1 & -\omega_x \\ -\omega_y & \omega_x & 1 \end{bmatrix}, \qquad T = \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix}$$

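Substituting for R and T in equation (2.1) we have,

$$X' = X - \omega_z Y + \omega_y Z + t_x \qquad (2.2.1)$$
$$Y' = Y + \omega_z X - \omega_x Z + t_y \qquad (2.2.2)$$
$$Z' = Z - \omega_y X + \omega_x Y + t_z \qquad (2.2.3)$$

$$\Delta X = t_x - \omega_z Y + \omega_y Z \qquad (2.3.1)$$
$$\Delta Y = t_y + \omega_z X - \omega_x Z \qquad (2.3.2)$$
$$\Delta Z = t_z - \omega_y X + \omega_x Y \qquad (2.3.3)$$

where,

$$\Delta X = X' - X, \qquad \Delta Y = Y' - Y, \qquad \Delta Z = Z' - Z$$

We define the parameter vector $\mathbf{a}$ for characterizing the motion, where

$$\mathbf{a} = (t_x, t_y, t_z, \omega_x, \omega_y, \omega_z)^T$$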
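The quality of the small rotation model can be checked numerically. The following sketch compares the linearized displacement, written as $\Delta X = t_x - \omega_z Y + \omega_y Z$, $\Delta Y = t_y + \omega_z X - \omega_x Z$, $\Delta Z = t_z - \omega_y X + \omega_x Y$, with an exact displacement obtained by composing three rotations about the X, Y and Z axes. The order of composition, the function names and the numerical values are assumptions made for illustration; for small angles the order affects only second order terms.

```python
import math

def rotate_exact(P, wx, wy, wz):
    """Rotate P about the X axis by wx, then the Y axis by wy, then the Z axis by wz."""
    X, Y, Z = P
    Y, Z = Y * math.cos(wx) - Z * math.sin(wx), Y * math.sin(wx) + Z * math.cos(wx)
    X, Z = X * math.cos(wy) + Z * math.sin(wy), -X * math.sin(wy) + Z * math.cos(wy)
    X, Y = X * math.cos(wz) - Y * math.sin(wz), X * math.sin(wz) + Y * math.cos(wz)
    return X, Y, Z

def exact_displacement(P, t, w):
    """Displacement of P under the exact rotation followed by the translation t."""
    R = rotate_exact(P, *w)
    return tuple(r + ti - p for r, ti, p in zip(R, t, P))

def small_angle_displacement(P, t, w):
    """Linearized displacement: translation plus cross product of w and P."""
    X, Y, Z = P
    tx, ty, tz = t
    wx, wy, wz = w
    return (tx - wz * Y + wy * Z,
            ty + wz * X - wx * Z,
            tz - wy * X + wx * Y)

def max_error(P, t, w):
    """Largest component-wise gap between the exact and linearized displacements."""
    e = exact_displacement(P, t, w)
    s = small_angle_displacement(P, t, w)
    return max(abs(a - b) for a, b in zip(e, s))

P, t = (1.0, 2.0, 5.0), (0.1, 0.0, -0.05)
error_small = max_error(P, t, (0.001, -0.002, 0.0015))
error_large = max_error(P, t, (0.5, -0.4, 0.3))
```

The gap is of second order in the rotation angles, so it is negligible for the small rotation and substantial for the large one.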

Motion  perception  involve'  the  recover)  of  the  parameters  of  motion,  as  well  as 
the  structure  (or  shape)  of  the  moving  object.  The  geometric  properties  of  the 
three  dimensional  surfaces  and  points  are  related  to  the  geometry  of  their  image. 
Thus  the  projective  transformation  involved  in  the  image  formation  process  must 
be  analyzed.  The  subsequent  analysis  considers  both  the  cases  of  "perspective"  as 
well  as  "orthographic"  projections. 
3. Analysis of Motion under Orthography

When the model of image formation involves orthographic or parallel
projection, then the mathematical formulation of the problem becomes
considerably simpler. It can be argued that this is a valid model of image
formation when viewing distant objects, or when the focal length of the camera is
large compared to the distance of the viewed surfaces, or when the viewing area is
small and centered around the line of sight - as in the case of the field of view
corresponding to the fovea in the retina.

Under orthography, the projection equation relating the position of a point in
three space P = (X, Y, Z) to its image p = (x, y) is:

$$(x, y) = (X, Y)$$

Assuming that after a short while the point moves to a position given by
P' = (X', Y', Z') while its image moves to p' = (x', y'), the following relations are
obtained from equations (2.3):

$$\Delta x = x' - x = \Delta X = t_x - \omega_z y + \omega_y Z$$
$$\Delta y = y' - y = \Delta Y = t_y + \omega_z x - \omega_x Z \qquad (3.1)$$

Optical flow is the time derivative of the image position vector and is denoted by
(u, v) where

$$(u, v) = (\dot{x}, \dot{y}) = (\dot{X}, \dot{Y})$$

Alternatively,

$$u = \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} = \frac{dx}{dt}, \qquad v = \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta t} = \frac{dy}{dt}$$

The motion parameters are now the translational velocity T = (U, V, W) and the
rotational velocity $\Omega = (\alpha, \beta, \gamma)$ where:

$$U = \lim_{\Delta t \to 0} \frac{t_x}{\Delta t}, \qquad V = \lim_{\Delta t \to 0} \frac{t_y}{\Delta t}, \qquad W = \lim_{\Delta t \to 0} \frac{t_z}{\Delta t}$$

and

$$\alpha = \lim_{\Delta t \to 0} \frac{\omega_x}{\Delta t}, \qquad \beta = \lim_{\Delta t \to 0} \frac{\omega_y}{\Delta t}, \qquad \gamma = \lim_{\Delta t \to 0} \frac{\omega_z}{\Delta t}$$

therefore the equations relating image and 3D motion are

$$u = U + \beta Z - \gamma y$$
$$v = V - \alpha Z + \gamma x \qquad (3.2)$$

These equations are identical in form to those obtained under the discrete
case (assuming small rotation), i.e. equation (3.1). Strictly speaking, according to
the nomenclature adopted before, the motion parameters for the discrete case are
$(t, \omega_x, \omega_y, \omega_z)$ and those for the differential case are
$(T, \alpha, \beta, \gamma)$. However, since equations (3.1) and (3.2) are identical
in form, all subsequent analysis is based on the latter equation. Furthermore, the
parameters (it will be evident later that only the rotational parameters are of
interest here), in both the differential as well as the discrete cases, will be referred
to by the symbols $(\alpha, \beta, \gamma)$. The treatment of both
the cases is identical, the only difference being that derivatives in the differential
analysis correspond to differences in the discrete case.
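As a minimal sketch, the orthographic image motion relation, written as $u = U + \beta Z - \gamma y$ and $v = V - \alpha Z + \gamma x$, can be evaluated directly. The function name and the sample values are hypothetical; the same function serves the discrete case if displacements are substituted for velocities.

```python
def orthographic_flow(x, y, Z, U, V, alpha, beta, gamma):
    """Image velocity (u, v) at image point (x, y) with depth Z under
    orthographic projection, for translational velocity components (U, V)
    and rotational velocity (alpha, beta, gamma)."""
    u = U + beta * Z - gamma * y
    v = V - alpha * Z + gamma * x
    return u, v

# Rotation about the line of sight only: the flow does not depend on depth.
near = orthographic_flow(1.0, 2.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.1)
far = orthographic_flow(1.0, 2.0, 50.0, 0.0, 0.0, 0.0, 0.0, 0.1)

# Pure translation: every image point moves with the same velocity (U, V).
shift = orthographic_flow(1.0, 2.0, 5.0, 0.3, -0.2, 0.0, 0.0, 0.0)
```

Note that the translational component W along the line of sight does not appear at all, which is why it cannot be recovered under this projection model.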


3.1. On the information available in the optical flow field

Observe from equation (3.2) that the image displacement (or image motion
field) consists of a translational part and a rotational part. The translational
motion parameters are dependent on the origin of reference. In fact the
parameters intrinsic to the motion are those of rotation. Thus relative to a
particular point, say the origin (0, 0), equation (3.2) becomes:

$$u = \beta Z - \gamma y$$
$$v = -\alpha Z + \gamma x \qquad (3.3)$$

where u actually means u - u(0,0), v is v - v(0,0) and Z is Z - Z(0,0). It should be
emphasized here that Z denotes depth relative to a certain point of reference (in
this case it is the origin). If the structure or relative depth is not known then the
parameters $(\alpha, \beta, \gamma)$ are not completely recoverable. There is an
exact analog of equation (3.3) for the discrete case, obtainable from equation (3.1).


Proposition I. When the depth function (or structure) is non planar the
following parameters are uniquely determined from the image displacement field:

(1) The rotation about the axis aligned with the line of sight, i.e. $\gamma$.

(2) The ratio of the other two parameters, i.e. $\alpha/\beta$.

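Proof: The proof is by contradiction. Consider the motion of the non planar
surface $Z_1$, which is described by the parameters $(\alpha_1, \beta_1, \gamma_1)$.
The image motion equations (from equation (3.3)) are:

$$u = \beta_1 Z_1 - \gamma_1 y, \qquad v = -\alpha_1 Z_1 + \gamma_1 x \qquad (3.4)$$

Suppose the same image motion field admits a second interpretation, a surface
$Z_2$ with parameters $(\alpha_2, \beta_2, \gamma_2)$:

$$u = \beta_2 Z_2 - \gamma_2 y, \qquad v = -\alpha_2 Z_2 + \gamma_2 x \qquad (3.5)$$

such that

$$\Delta\gamma = \gamma_2 - \gamma_1 \neq 0, \qquad \frac{\alpha_1}{\beta_1} \neq \frac{\alpha_2}{\beta_2} \qquad (3.6)$$

From equations (3.4) and (3.5) the following relations are obtained:

$$\beta_2 Z_2 - \beta_1 Z_1 - \Delta\gamma\, y = 0$$
$$-\alpha_2 Z_2 + \alpha_1 Z_1 + \Delta\gamma\, x = 0 \qquad (3.7)$$

Now since $\alpha_1 \beta_2 \neq \alpha_2 \beta_1$:

$$Z_1 = \frac{-\alpha_2 \Delta\gamma\, y + \beta_2 \Delta\gamma\, x}{\alpha_2 \beta_1 - \alpha_1 \beta_2}$$

which is linear in (x, y). But this is contrary to the assumption that $Z_1$ is non
planar. Therefore:

$$\frac{\alpha_1}{\beta_1} = \frac{\alpha_2}{\beta_2}$$

Again, this implies (considering equation (3.7)) that

$$\Delta\gamma = 0 \quad \text{or} \quad \gamma_1 = \gamma_2$$

This completes the proof of Proposition I.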
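The reason only the ratio $\alpha/\beta$ (and not $\alpha$ and $\beta$ individually) can be recovered is that the relative image motion $u = \beta Z - \gamma y$, $v = -\alpha Z + \gamma x$ is unchanged when $\alpha$ and $\beta$ are multiplied by a constant and the relative depth is divided by the same constant. The sketch below illustrates this with a hypothetical non planar surface; all names and values are assumptions chosen for illustration.

```python
def relative_flow(x, y, Z, alpha, beta, gamma):
    """Image motion relative to the origin for relative depth Z."""
    return beta * Z - gamma * y, -alpha * Z + gamma * x

def surface(x, y):
    """Hypothetical non planar relative depth, zero at the origin."""
    return x * x + 0.5 * y * y

def flow_field(alpha, beta, gamma, depth_scale, points):
    """Relative flow over a set of image points for a scaled copy of the surface."""
    return [relative_flow(x, y, depth_scale * surface(x, y), alpha, beta, gamma)
            for x, y in points]

points = [(0.1 * i, 0.1 * j) for i in range(-3, 4) for j in range(-3, 4)]

k = 3.0
original = flow_field(0.2, 0.1, 0.05, 1.0, points)
rescaled = flow_field(0.2 * k, 0.1 * k, 0.05, 1.0 / k, points)
```

The two fields coincide although the individual rotation components differ by the factor `k`; the ratio of the first two components and the third component are the same in both interpretations.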


Proposition II. The image displacement field generated by a planar surface
is linear in the arguments (x, y). In addition, the parameters $\alpha/\beta$ and
$\gamma$ are uniquely determined by the image displacement field if and only if
$\alpha p + \beta q = 0$, where (p, q) is the gradient of the planar surface.

Proof: Consider the equation of the planar surface Z(x, y):

$$Z = px + qy + c$$

If the motion of the surface is characterized by the parameters
$(\alpha, \beta, \gamma)$, the image motion (or optical flow), with depth measured
relative to the origin as in equation (3.3), is given by:

$$u = \beta(px + qy) - \gamma y$$
$$v = -\alpha(px + qy) + \gamma x \qquad (3.8)$$

The above equation indicates that for planar surfaces the optical flow is linear. It is
also true that when the optical flow is linear then the moving surface is planar.
Now consider a second interpretation of the same flow field, with parameters
$(\alpha', \beta', \gamma')$ and gradient $(p', q')$. Considering equation (3.3),
substituting for (u, v) from equation (3.8) and rearranging terms:

$$0 = (\beta' p' - \beta p)\,x + (\beta' q' - \beta q - \gamma' + \gamma)\,y$$
$$0 = (-\alpha' p' + \alpha p + \gamma' - \gamma)\,x + (-\alpha' q' + \alpha q)\,y \qquad (3.9)$$

Since the above equations are valid for the entire image we have:

$$\beta' p' = \beta p \qquad (3.10.1)$$
$$\beta' q' - \gamma' = \beta q - \gamma \qquad (3.10.2)$$
$$\alpha' p' - \gamma' = \alpha p - \gamma \qquad (3.10.3)$$
$$\alpha' q' = \alpha q \qquad (3.10.4)$$

Eliminating $p'$, $q'$ and $\gamma'$ from the above:

$$\alpha q\,\mu - \frac{\beta p}{\mu} = \beta q - \alpha p$$

i.e.

$$\alpha q\,\mu^2 - \mu(\beta q - \alpha p) - \beta p = 0$$

where $\mu = \beta'/\alpha'$. The above quadratic equation has a unique solution if
and only if:

$$(\beta q - \alpha p)^2 + 4\alpha\beta p q = (\beta q + \alpha p)^2 = 0$$

Under this condition:

$$\frac{\beta'}{\alpha'} = \frac{\beta}{\alpha} = -\frac{p}{q}$$

Therefore the image motion of planar surfaces uniquely determines the parameters
$(\alpha/\beta, \gamma)$ if and only if $\alpha p + \beta q = 0$.
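The planar ambiguity can be exhibited explicitly. The sketch below assumes the planar flow $u = \beta(px + qy) - \gamma y$, $v = -\alpha(px + qy) + \gamma x$ and constructs a second interpretation whose ratio $\beta'/\alpha'$ equals $-p/q$. The construction (the function `alternative_interpretation` and its free scale `s`) is an illustration worked out from the matching conditions on the linear coefficients; it is not taken from the original text and it requires $q \neq 0$ and $s \neq 0$.

```python
def planar_flow(x, y, alpha, beta, gamma, p, q):
    """Relative image motion of the plane Z = p*x + q*y."""
    Z = p * x + q * y
    return beta * Z - gamma * y, -alpha * Z + gamma * x

def alternative_interpretation(alpha, beta, gamma, p, q, s=1.0):
    """A second (alpha', beta', gamma', p', q') producing the same flow.
    Requires q != 0 and s != 0; s is an arbitrary scale."""
    alpha2 = s
    beta2 = -s * p / q              # ratio beta'/alpha' equals -p/q
    p2 = -beta * q / s              # keeps beta' * p' equal to beta * p
    q2 = alpha * q / s              # keeps alpha' * q' equal to alpha * q
    gamma2 = gamma - alpha * p - beta * q
    return alpha2, beta2, gamma2, p2, q2

first = (0.2, 0.1, 0.05, 0.5, 1.0)   # hypothetical alpha, beta, gamma, p, q
second = alternative_interpretation(*first)

points = [(0.1 * i, 0.1 * j) for i in range(-3, 4) for j in range(-3, 4)]
flow_first = [planar_flow(x, y, *first) for x, y in points]
flow_second = [planar_flow(x, y, *second) for x, y in points]
```

The two parameter sets yield the same flow at every image point, while $\gamma' - \gamma = -(\alpha p + \beta q)$. The rotation about the line of sight therefore agrees in both interpretations exactly when $\alpha p + \beta q = 0$, in line with the condition of Proposition II.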
3.2. Summary

(1) The analyses under orthographic projection for differential and discrete
motion are nearly identical.

(2) When the structure of the moving object is known, the motion parameters
can be computed uniquely from image motion.

(3) When the structure is not known then the recoverable parameters are
$(\alpha/\beta, \gamma)$. However in this case, the values are unique only when
the moving surface is non planar, or a certain condition (see Proposition II) holds.

4.  Analysis  of  Rigid  Motion  for  the  Perspective  Projection  Model 

Under perspective projection, the "image" is formed by "rays" from points in
three space (i.e. world points). These rays are constrained to pass through a nodal
point called the center of perspectivity. The imaging geometry is shown in figure
IIa. The nodal point is O, which is also taken as the origin of the frame of
reference. An image point p = (x, y) corresponds to the world point P = (X, Y, Z).
Here the focal length of the imaging system is F. The equation of the ray OP is:

$$\frac{x}{X} = \frac{y}{Y} = \frac{F}{Z}$$

Therefore,

$$x = \frac{FX}{Z}, \qquad y = \frac{FY}{Z} \qquad (4.1)$$

The above projection is denoted by (X, Y, Z) → (x, y, F). Similarly, the projective
relation between another world point P' and its image is (X', Y', Z') → (x', y', F).
Thus from equation (4.1) we have,
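$$\Delta x = x' - x = F\left(\frac{X'}{Z'} - \frac{X}{Z}\right), \qquad \Delta y = y' - y = F\left(\frac{Y'}{Z'} - \frac{Y}{Z}\right)$$

$$\Delta x = F\,\frac{Z\,\Delta X - X\,\Delta Z}{Z(Z + \Delta Z)} \qquad (4.2.1)$$

$$\Delta y = F\,\frac{Z\,\Delta Y - Y\,\Delta Z}{Z(Z + \Delta Z)} \qquad (4.2.2)$$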
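The closed form of the retinal displacement, $\Delta x = F(Z\Delta X - X\Delta Z)/(Z(Z + \Delta Z))$ and likewise for $\Delta y$, can be checked against a direct difference of two perspective projections. The sketch below does so; the function names and the sample point are hypothetical.

```python
def project(P, F):
    """Perspective projection x = F*X/Z, y = F*Y/Z of a world point P = (X, Y, Z)."""
    X, Y, Z = P
    return F * X / Z, F * Y / Z

def image_displacement(P, dP, F):
    """Closed-form retinal displacement of a world point P that moves by dP."""
    X, Y, Z = P
    dX, dY, dZ = dP
    denom = Z * (Z + dZ)
    return F * (Z * dX - X * dZ) / denom, F * (Z * dY - Y * dZ) / denom

P, dP, F = (1.0, -0.5, 4.0), (0.2, 0.1, -0.3), 2.0
moved = tuple(a + b for a, b in zip(P, dP))

x0, y0 = project(P, F)
x1, y1 = project(moved, F)
direct = (x1 - x0, y1 - y0)
closed_form = image_displacement(P, dP, F)
```

The identity is exact for any 3D displacement; the small rotation assumption only enters when the 3D displacement itself is expressed through the motion parameters.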

Recall that when the 3D rotation angles characterizing the rigid motion are
"small", the 3D displacement components are given by the relations:

$$\Delta X = t_x - \omega_z Y + \omega_y Z$$
$$\Delta Y = t_y + \omega_z X - \omega_x Z$$
$$\Delta Z = t_z - \omega_y X + \omega_x Y$$

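Thus, substituting for $\Delta X$, $\Delta Y$ and $\Delta Z$ in the equation (4.2) we have:

$$\Delta x = F\,\frac{Z(t_x + \omega_y Z - \omega_z Y) - X(t_z + \omega_x Y - \omega_y X)}{Z^2 + Z(t_z + \omega_x Y - \omega_y X)}$$

or,

$$\Delta x = \frac{\dfrac{F t_x - x t_z}{Z} + F\omega_y - \omega_z y - \omega_x \dfrac{xy}{F} + \omega_y \dfrac{x^2}{F}}{1 + \dfrac{t_z}{Z} + \omega_x \dfrac{y}{F} - \omega_y \dfrac{x}{F}}$$

Similarly, we obtain an expression for the other component of the retinal
displacement.

$$\Delta y = \frac{\dfrac{F t_y - y t_z}{Z} - F\omega_x + \omega_z x + \omega_y \dfrac{xy}{F} - \omega_x \dfrac{y^2}{F}}{1 + \dfrac{t_z}{Z} + \omega_x \dfrac{y}{F} - \omega_y \dfrac{x}{F}}$$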

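The above equations express the retinal displacement vector
$(\Delta x, \Delta y)$ at an image point p = (x, y) in terms of the parameter vector
$\mathbf{a}$ and the "depth" coordinate Z for the corresponding world point
P = (X, Y, Z). Another form of the above equations is,

$$\Delta x = \frac{(x_0 - x)\dfrac{t_z}{Z} + F\omega_y - \omega_z y - \omega_x \dfrac{xy}{F} + \omega_y \dfrac{x^2}{F}}{1 + \dfrac{t_z}{Z} + \omega_x \dfrac{y}{F} - \omega_y \dfrac{x}{F}} \qquad (4.3.1)$$

$$\Delta y = \frac{(y_0 - y)\dfrac{t_z}{Z} - F\omega_x + \omega_z x + \omega_y \dfrac{xy}{F} - \omega_x \dfrac{y^2}{F}}{1 + \dfrac{t_z}{Z} + \omega_x \dfrac{y}{F} - \omega_y \dfrac{x}{F}} \qquad (4.3.2)$$

where,

$$(x_0, y_0) = \left(\frac{F t_x}{t_z}, \frac{F t_y}{t_z}\right)$$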
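The retinal displacement equations can be cross-checked numerically against a direct projection of the displaced world point. The sketch below assumes the small rotation displacement model $\Delta P = t + \omega \wedge P$ and evaluates the image displacement in the form with numerators $(F t_x - x t_z)/Z$ and $(F t_y - y t_z)/Z$, which avoids dividing by $t_z$; function names and sample values are hypothetical.

```python
def project(P, F):
    """Perspective projection of the world point P = (X, Y, Z)."""
    X, Y, Z = P
    return F * X / Z, F * Y / Z

def small_rotation_displacement(P, t, w):
    """3D displacement t + w x P, valid for small rotation angles."""
    X, Y, Z = P
    tx, ty, tz = t
    wx, wy, wz = w
    return (tx - wz * Y + wy * Z,
            ty + wz * X - wx * Z,
            tz - wy * X + wx * Y)

def retinal_displacement(x, y, Z, t, w, F):
    """Image displacement at (x, y) for depth Z and motion parameters (t, w)."""
    tx, ty, tz = t
    wx, wy, wz = w
    denom = 1.0 + tz / Z + wx * y / F - wy * x / F
    num_x = (F * tx - x * tz) / Z + F * wy - wz * y - wx * x * y / F + wy * x * x / F
    num_y = (F * ty - y * tz) / Z - F * wx + wz * x + wy * x * y / F - wx * y * y / F
    return num_x / denom, num_y / denom

P, F = (1.0, -0.5, 4.0), 2.0
t, w = (0.2, 0.1, 0.3), (0.01, -0.02, 0.03)

x, y = project(P, F)
dP = small_rotation_displacement(P, t, w)
x2, y2 = project(tuple(a + b for a, b in zip(P, dP)), F)

direct = (x2 - x, y2 - y)
formula = retinal_displacement(x, y, P[2], t, w, F)
```

Given the small rotation displacement model, the two results agree to rounding error, because the image-plane form is an algebraic rearrangement and involves no further approximation.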


Note here that, when the displacement is purely translational,

$$\Delta x = (x_0 - x)\,\frac{t_z}{Z + t_z}, \qquad \Delta y = (y_0 - y)\,\frac{t_z}{Z + t_z}$$

This means that when the rotational component of the displacement is zero, the
image displacement vectors meet at one point $(x_0, y_0)$. That is to say, the retinal
displacement field converges to or diverges from a single point in the image plane.
This point is called the focus of contraction (FOC) or the focus of expansion
(FOE), depending on whether the translational motion is directed away from or
towards the image plane (figure III).
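The convergence of a purely translational displacement field on a single image point can be verified with a short sketch. It projects several hypothetical world points before and after a common translation and checks that each image displacement is collinear with the line joining the image point to $(x_0, y_0) = (F t_x / t_z, F t_y / t_z)$. All names and values are assumptions for illustration, and $t_z \neq 0$ is required.

```python
def project(P, F):
    """Perspective projection of the world point P = (X, Y, Z)."""
    X, Y, Z = P
    return F * X / Z, F * Y / Z

def focus_of_expansion(t, F):
    """Image point (x0, y0) on which translational displacements converge.
    Requires a non-zero translation component t[2] along the line of sight."""
    return F * t[0] / t[2], F * t[1] / t[2]

def collinearity_residual(P, t, F):
    """2D cross product between the image displacement of P and the vector
    from its image to the focus of expansion; zero when they are collinear."""
    x, y = project(P, F)
    x2, y2 = project(tuple(a + b for a, b in zip(P, t)), F)
    x0, y0 = focus_of_expansion(t, F)
    return (x2 - x) * (y0 - y) - (y2 - y) * (x0 - x)

F, t = 1.5, (0.2, -0.1, 0.4)
world_points = [(1.0, 2.0, 5.0), (-2.0, 0.5, 8.0), (0.3, -1.2, 3.0), (4.0, 4.0, 20.0)]
residuals = [collinearity_residual(P, t, F) for P in world_points]
```

The residuals vanish regardless of the depth of each point, which is what makes the focus of expansion usable before the scene structure is known.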

If we can measure the retinal displacement field due to a particular motion,
then it is possible to estimate the parameters characterizing the motion. In
addition, if the temporal sampling rate of our imaging process is high - meaning
that the components of the displacement for a single time interval are small and the
following condition holds,

$$\frac{t_z}{Z} + \omega_x \frac{y}{F} - \omega_y \frac{x}{F} \ll 1 \qquad (A)$$

it is possible to derive the equations relating image motion to the motion
parameters in the differential case. This is obtained by dividing equation (4.3) by
a small time interval, $\Delta t$, and taking the limit as $\Delta t \to 0$. The image
displacement then becomes image velocity, and is called optical flow. The optical
flow is denoted by the vector (u, v) where:

$$u = \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} = \frac{dx}{dt}, \qquad v = \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta t} = \frac{dy}{dt}$$

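Similarly the motion parameters are now the translational velocity T = (U, V, W)
and the rotational velocity $\Omega = (\alpha, \beta, \gamma)$ where:

$$U = \lim_{\Delta t \to 0} \frac{t_x}{\Delta t}, \qquad V = \lim_{\Delta t \to 0} \frac{t_y}{\Delta t}, \qquad W = \lim_{\Delta t \to 0} \frac{t_z}{\Delta t}$$

and

$$\alpha = \lim_{\Delta t \to 0} \frac{\omega_x}{\Delta t}, \qquad \beta = \lim_{\Delta t \to 0} \frac{\omega_y}{\Delta t}, \qquad \gamma = \lim_{\Delta t \to 0} \frac{\omega_z}{\Delta t}$$

Equation (4.3) now becomes:

$$u = (x_0 - x)\frac{W}{Z} + F\beta - \gamma y - \alpha \frac{xy}{F} + \beta \frac{x^2}{F} \qquad (4.5.1)$$

$$v = (y_0 - y)\frac{W}{Z} - F\alpha + \gamma x + \beta \frac{xy}{F} - \alpha \frac{y^2}{F} \qquad (4.5.2)$$

where the 3D motion is now characterized by a translational velocity (U, V, W) and
a rotational velocity $(\alpha, \beta, \gamma)$. Furthermore the FOE is now given by

$$(x_0, y_0) = \left(\frac{FU}{W}, \frac{FV}{W}\right)$$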
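As a sketch of how the optical flow equations relate to measured displacements, the code below evaluates the flow at an image point and compares it with a finite difference of projected positions over a very short interval. It assumes the point velocity $T + \Omega \wedge P$ and writes the translational terms as $(FU - xW)/Z$ and $(FV - yW)/Z$, which equal $(x_0 - x)W/Z$ and $(y_0 - y)W/Z$ when $W \neq 0$. Function names and values are hypothetical.

```python
def project(P, F):
    """Perspective projection of the world point P = (X, Y, Z)."""
    X, Y, Z = P
    return F * X / Z, F * Y / Z

def point_velocity(P, T, Omega):
    """3D velocity T + Omega x P of a point on the rigid body."""
    X, Y, Z = P
    U, V, W = T
    a, b, g = Omega
    return (U + b * Z - g * Y, V + g * X - a * Z, W + a * Y - b * X)

def optical_flow(x, y, Z, T, Omega, F):
    """Image velocity (u, v) at image point (x, y) with depth Z."""
    U, V, W = T
    a, b, g = Omega
    u = (F * U - x * W) / Z + F * b - g * y - a * x * y / F + b * x * x / F
    v = (F * V - y * W) / Z - F * a + g * x + b * x * y / F - a * y * y / F
    return u, v

P, F = (1.0, -0.5, 4.0), 1.0
T, Omega = (0.2, 0.1, 0.3), (0.01, -0.02, 0.03)

x, y = project(P, F)
u, v = optical_flow(x, y, P[2], T, Omega, F)

# Finite difference over a short interval dt, for comparison.
dt = 1e-6
vel = point_velocity(P, T, Omega)
x2, y2 = project(tuple(p + s * dt for p, s in zip(P, vel)), F)
u_fd, v_fd = (x2 - x) / dt, (y2 - y) / dt
```

The finite difference approaches the flow as `dt` shrinks; for a coarse `dt` the two differ, which is the retinal velocity approximation at work.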


Motion perception involves the computation of the parameters of motion
from the image displacement field. The latter becomes, in the limiting case, a field
of velocities, called optical flow. The relation that optical flow has with the motion
parameters is embodied in equations (4.5). These motion equations involve
velocities, both in 3D as well as in the retina. However, in a practical vision
system, the retinal measurements that are actually made involve displacements
over a small time interval. This means the above velocity equations are not
strictly applicable. Under certain conditions, the penalty paid for doing this may
not be too severe. This happens when the error introduced by the velocity
approximation is within some predetermined bounds.

There are two separate approximations embodied in the usage of the equations
(4.5) to express the constraints on image motion due to the 3D motion parameters:

(1) The three dimensional velocity approximation - The velocity of a point
$P = (X, Y, Z)$ on a rigid body, moving with a translational velocity $T = (U, V, W)$
and a rotational velocity $\Omega = (\alpha, \beta, \gamma)$, is given by

$$\frac{dP}{dt} = T + \Omega \wedge P$$

Integrating the above with respect to time we have

$$\int \frac{dP}{dt}\,dt = \int (T + \Omega \wedge P)\,dt$$

Here $\wedge$ denotes the vector cross product. The three dimensional velocity
approximation implies that, for small $\Delta t$, the displacement can be
expressed as:

$$\Delta P = (\Delta X, \Delta Y, \Delta Z) = T\,\Delta t + (\Omega\,\Delta t) \wedge P$$

(2) The retinal velocity approximation - This enables us to treat retinal
displacements as retinal velocities and is valid so long as $\frac{\Delta Z}{Z} \ll 1$.
This can also be written as relation (A) stated previously.
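The size of the error made by the first approximation can be illustrated
numerically. The following is a minimal sketch, assuming the kinematic relation
dP/dt = T + Omega x P with constant T and Omega; the function names and the
sample values are illustrative and are not taken from the text. It compares the
first-order displacement with a finely integrated one and shows that the
discrepancy shrinks roughly with the square of the time interval.

```python
import math


def cross(a, b):
    """Vector cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def velocity(P, T, Om):
    """Rigid-body velocity T + Om x P of the point P."""
    c = cross(Om, P)
    return (T[0] + c[0], T[1] + c[1], T[2] + c[2])


def first_order_displacement(P, T, Om, dt):
    """Displacement T*dt + (Om*dt) x P used by the velocity approximation."""
    return tuple(dt * vi for vi in velocity(P, T, Om))


def integrated_displacement(P, T, Om, dt, steps=1000):
    """Displacement from RK4 integration of dP/dt = T + Om x P over dt."""
    h = dt / steps
    p = P
    for _ in range(steps):
        k1 = velocity(p, T, Om)
        k2 = velocity(tuple(p[i] + 0.5 * h * k1[i] for i in range(3)), T, Om)
        k3 = velocity(tuple(p[i] + 0.5 * h * k2[i] for i in range(3)), T, Om)
        k4 = velocity(tuple(p[i] + h * k3[i] for i in range(3)), T, Om)
        p = tuple(p[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                  for i in range(3))
    return tuple(p[i] - P[i] for i in range(3))


def approximation_error(P, T, Om, dt):
    """Euclidean distance between the first-order and integrated displacements."""
    a = first_order_displacement(P, T, Om, dt)
    b = integrated_displacement(P, T, Om, dt)
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))
```

Calling `approximation_error` with the same point and motion for two time
intervals that differ by a factor of ten should give errors that differ by
roughly a factor of one hundred.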

When both the translational velocity $T$ and the depth function $Z$ are
multiplied by the same constant, the latter cancels out leaving the equations (4.5)
unchanged. The same applies to the equations (4.3). This means that scaling the
translation by a constant factor, and at the same time causing a depth dilation by
the same factor, leaves the image displacement field unchanged. Thus, from the
information available in the image displacement field, the translation vector is
obtainable only up to a scale factor.
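The scale ambiguity is easy to confirm numerically. The sketch below assumes the
differential flow model u = (FU - xW)/Z + F*beta - y*gamma - alpha*x*y/F + beta*x^2/F
and v = (FV - yW)/Z - F*alpha + x*gamma - alpha*y^2/F + beta*x*y/F; the function
names, the depth function and the sample values are hypothetical. It evaluates
the flow before and after multiplying both the translation and the depth by the
same constant.

```python
def flow(x, y, Z, T, Om, F=1.0):
    """Optical flow (u, v) at image point (x, y) for depth Z (assumed model)."""
    U, V, W = T
    a, b, g = Om
    u = (F * U - x * W) / Z + F * b - y * g - a * x * y / F + b * x * x / F
    v = (F * V - y * W) / Z - F * a + x * g - a * y * y / F + b * x * y / F
    return u, v


def max_flow_change_under_scaling(k, points, depth, T, Om):
    """Largest change in any flow component when T and Z are both scaled by k."""
    scaled_T = tuple(k * c for c in T)
    worst = 0.0
    for x, y in points:
        Z = depth(x, y)
        u1, v1 = flow(x, y, Z, T, Om)
        u2, v2 = flow(x, y, k * Z, scaled_T, Om)
        worst = max(worst, abs(u1 - u2), abs(v1 - v2))
    return worst
```

For any nonzero `k` the returned value should be zero up to rounding error,
which is why only the direction of the translation can be recovered from the flow.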

In equation (4.3) the depth variable $Z$ is an unknown. An equation relating
image displacement to the motion parameters is obtained by eliminating the depth
from equations (4.3):


,  - —  —  (. 1  -  I 
'  /  / 

\  \ 
•<T-  W.yl 


cO  I  ~  CO  , 


,  y  u 

/  u  ,  U  ■  X  -  Ul,  —  +  U.’  .  ~JT 


» n  -  >  -  A  l 


CO  x  (J  '  u'  Q 
CO  ,  d  ,  U  Q 


U’.l  /  ♦  Ax 

U'.\l  +  Ai 


x c  -  x  -  At 
In-  V  -  A  l 


<p:  =  t  Aa  -  u  <j>  =  f'~  +  At  A  v  +  a - 

<P)  =  /  ’  -  t  A i  i-  i  •  qpj  =  ^  A»  +  aa 

The above equation relates the motion parameters to the image displacements,
which are observables. This is a bilinear equation in the unknown motion
parameters. A similar relation is obtained for the differential motion case, by
eliminating $\frac{W}{Z}$ from equation (4.5):

$$\frac{u - \left(F\beta - y\gamma - \alpha\frac{xy}{F} + \beta\frac{x^2}{F}\right)}{v - \left(-F\alpha + x\gamma - \alpha\frac{y^2}{F} + \beta\frac{xy}{F}\right)} = \frac{x_0 - x}{y_0 - y} \qquad (4.7)$$

In the above analysis, the relation between image motion and 3D motion has
been derived by assuming general displacement of a rigid constellation of points in
space. This relation is given by equation (4-5). From this, by taking the limiting
case, for infinitesimal displacement, the "continuous" or differential motion case is
obtained. The latter relation can also be obtained directly from the kinematic
equations of rigid motion (see Appendix I or [19] for details).


4.1.  The  Information  available  in  the  image  displacement  field 

The  foregoing  analysis  illustrates  the  dependence  of  the  optical  flow  field  on 
the  motion  parameters.  In  other  words  3D  motion  constrains  image  motion.  The 
magnitude  of  the  translation  parameter  vector  cannot  be  computed  from  the 
optical  flow  field.  The  rigid  motion  parameters  observable  from  monocular  retinal 
optical  flow  measurements  are  given  by  the  parameter  vector  3 


J  =  lAo,ro.W,.U,.fc!..| 

Now, we examine the motion equations to see whether the displacement field
uniquely determines J.


The question to be answered, before attempting the design of algorithms to
compute the motion parameters from optical flow, is whether such computation is
feasible. That is, given an optical flow field, when can we say that it could
be produced by a unique set of motion parameters? The following theorem
answers this question by giving a sufficient condition for uniqueness.

Theorem I: The optical flow field is uniquely determined by the rigid motion
parameters when the moving surface cannot be expressed as a rational function of
the form $\frac{P_1(x, y)}{Q_2(x, y)}$, where $P_1$ and $Q_2$ are polynomials of the
first and second orders respectively, and $(x, y)$ are image coordinates.

Proof: Let a rigid surface $Z'$, moving with translational and rotational velocities
$(U', V', W')$ and $(\alpha', \beta', \gamma')$ respectively, generate the optical flow
field $(u', v')$ given by

$$u' = \frac{FU' - xW'}{Z'} + F\beta' - y\gamma' - \alpha'\frac{xy}{F} + \beta'\frac{x^2}{F}$$

$$v' = \frac{FV' - yW'}{Z'} - F\alpha' + x\gamma' - \alpha'\frac{y^2}{F} + \beta'\frac{xy}{F} \qquad (4.8)$$

where the translation parameter vector is $(U', V', W')$ and the rotational velocity is
$(\alpha', \beta', \gamma')$.

Assume that there is another surface $Z(x, y)$ moving with a different set of motion
parameters but giving rise to the same optical flow field $(u, v)$, or

$$u = \frac{FU - xW}{Z} + F\beta - y\gamma - \alpha\frac{xy}{F} + \beta\frac{x^2}{F}$$

$$v = \frac{FV - yW}{Z} - F\alpha + x\gamma - \alpha\frac{y^2}{F} + \beta\frac{xy}{F} \qquad (4.9)$$

where the 3D motion is now due to a translational velocity $(U, V, W)$ and a
rotational velocity $(\alpha, \beta, \gamma)$.

Since $u - u' = 0$ and $v - v' = 0$ everywhere in the image, we have from equations
(4.8) and (4.9):

$$\frac{FU - xW}{Z} - \frac{FU' - xW'}{Z'} + F\Delta\beta - y\Delta\gamma - \frac{xy}{F}\Delta\alpha + \frac{x^2}{F}\Delta\beta = 0 \qquad (4.10.1)$$

$$\frac{FV - yW}{Z} - \frac{FV' - yW'}{Z'} - F\Delta\alpha + x\Delta\gamma - \frac{y^2}{F}\Delta\alpha + \frac{xy}{F}\Delta\beta = 0 \qquad (4.10.2)$$

where $\Delta\alpha = \alpha - \alpha'$, $\Delta\beta = \beta - \beta'$, and $\Delta\gamma = \gamma - \gamma'$.

Considering the above set of equations and solving for the variable $Z'$ we have
(assuming the focal length $F$ to be unity):

$$Z' = \frac{P_1(x, y)}{Q_2(x, y)}$$

where

$$P_1(x, y) = (U'V - UV') + x(V'W - VW') + y(UW' - U'W) \qquad (4.11)$$

and

$$Q_2(x, y) = (\Delta\beta V + \Delta\alpha U) - x(\Delta\alpha W + \Delta\gamma U) - y(\Delta\beta W + \Delta\gamma V) - xy(\Delta\alpha V + \Delta\beta U) + x^2(\Delta\beta V + \Delta\gamma W) + y^2(\Delta\alpha U + \Delta\gamma W) \qquad (4.12)$$

The above implies that the surface $Z'$ that originally generated the optical flow
must be a rational function of the form $\frac{P_1}{Q_2}$ to permit ambiguous
interpretation of its rigid motion. This is contrary to the statement of the
theorem. This proves the theorem.
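The construction used in the proof can be checked numerically. The sketch below
assumes unit focal length and the flow model u = (U - xW)/Z + beta - y*gamma -
alpha*x*y + beta*x^2, v = (V - yW)/Z - alpha + x*gamma - alpha*y^2 + beta*x*y; the
function names and test values are illustrative only. Given two arbitrary motions
it builds the depth of the first surface as the ratio of a first-order to a
second-order polynomial, solves the u-constraint for the depth of the second
surface, and the two flow fields can then be compared. Depth signs are not
checked, so the example illustrates the algebra rather than a physically valid
scene.

```python
def flow(x, y, Z, T, Om):
    """Flow (u, v) at image point (x, y), unit focal length (assumed model)."""
    U, V, W = T
    a, b, g = Om
    u = (U - x * W) / Z + b - y * g - a * x * y + b * x * x
    v = (V - y * W) / Z - a + x * g - a * y * y + b * x * y
    return u, v


def ambiguous_depths(x, y, T1, Om1, T2, Om2):
    """Depths (Z1, Z2) at (x, y) for which motion 1 and motion 2 give equal flow."""
    U1, V1, W1 = T1
    U, V, W = T2
    # Rotation differences: second motion minus first motion.
    da, db, dg = (Om2[i] - Om1[i] for i in range(3))
    # First-order numerator and second-order denominator of the rational depth.
    P = (U1 * V - U * V1) + x * (V1 * W - V * W1) + y * (U * W1 - U1 * W)
    Q = ((db * V + da * U) - x * (da * W + dg * U) - y * (db * W + dg * V)
         - x * y * (da * V + db * U) + x * x * (db * V + dg * W)
         + y * y * (da * U + dg * W))
    Z1 = P / Q
    # Depth of the second surface from the u-component constraint.
    R1 = db - y * dg - x * y * da + x * x * db
    Z2 = (U - x * W) / ((U1 - x * W1) / Z1 - R1)
    return Z1, Z2
```

Evaluating `flow` with `Z1` and the first motion, and with `Z2` and the second
motion, should give the same `(u, v)` at every image point where the
denominators do not vanish. The v-components agree only because `Z1` has the
rational form, which is the content of the theorem.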

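Corollary I: When the motion of a surface is purely rotational, the optical flow field
is uniquely determined by the motion.

Proof: In equation (4.10) making the substitutions $U' = V' = W' = 0$ we get:

$$\frac{FU - xW}{Z} + F\Delta\beta - y\Delta\gamma - \frac{xy}{F}\Delta\alpha + \frac{x^2}{F}\Delta\beta = 0$$

$$\frac{FV - yW}{Z} - F\Delta\alpha + x\Delta\gamma - \frac{y^2}{F}\Delta\alpha + \frac{xy}{F}\Delta\beta = 0$$

Now, eliminating $Z$ from the above equations and setting the focal length $F$ to
unity, we obtain:

$$(\Delta\beta V + \Delta\alpha U) - x(\Delta\alpha W + \Delta\gamma U) - y(\Delta\beta W + \Delta\gamma V) - xy(\Delta\alpha V + \Delta\beta U) + x^2(\Delta\beta V + \Delta\gamma W) + y^2(\Delta\alpha U + \Delta\gamma W) = 0$$

From the above equation we have a set of six equations:

$$\Delta\alpha U + \Delta\beta V = 0$$
$$\Delta\alpha V + \Delta\beta U = 0$$
$$\Delta\beta V + \Delta\gamma W = 0$$
$$\Delta\alpha U + \Delta\gamma W = 0$$
$$\Delta\alpha W + \Delta\gamma U = 0$$
$$\Delta\beta W + \Delta\gamma V = 0$$

The above equations imply either $U = V = W = 0$ or $\Delta\alpha = \Delta\beta = \Delta\gamma = 0$.

Both these conditions mean that the optical flow field due to a pure rotational
motion has a unique interpretation. This proves the corollary.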

Corollary II: It is possible for a flow field generated by pure translatory motion to be
identical to one generated by a motion involving both translation and rotation.
In other words convergence of the flow vectors directly onto a point on the image
plane does not imply purely translatory motion.

The truth of the above corollary will be demonstrated by a numerical example.
Consider two flow fields generated by different surfaces undergoing different
motions:


In the first case the motion is due to a planar surface given by the equation:


The motion is rigid and is specified by


Uo  =  -  -0 o  =  a  =  5./?  =  3,-y  =  0) 

Assume  the  translation  in  depth  to  be  unity.  Then,  from  equation  (4.8)  we  have. 

u=(x  -  -^-Kl  -  4-jt +  7-v )  -  3+5.W  -  3.V* 

2  2  o 

3  35  35  1  -  1 

w=  -  7(+f+tx-'  -  r**y 

35  139  35  •>  5 

V  =  —  v  -  t  +  — -  \  -  -  —  .W  -  — 

12  36  '  6  2  6 

In the second case the motion is due to the planar surface given by the equation:

/=  2 v "  >  t1 

and the motion is specified by the parameter vector

(.*o=  -  =  7- a  =  0.  /3  =  0.  y  =  0) 

O 

The optical flow fields in both examples are identical.

The question of multiple interpretations of the same flow field has received
some attention in the literature. The foregoing example illustrates the fact that
the motion of planes is potentially open to more than one interpretation. It is
known (see [27-29,34]) that the motion of planes has dual interpretations.
Uniqueness of interpretation for planes requires three views of four points, or two
views of seven points which uniquely define two planes neither of which passes
through the origin. In another study Fang and Huang [9] showed that nine points
not lying on a second order surface passing through the origin can be used to
determine the motion parameters uniquely. Another significant theoretical result is
due to Longuet-Higgins [20], and Tsai and Huang [30], where eight points are used
to solve for the motion parameters from a set of linear equations. The important
questions as yet unanswered are under what conditions the optical flow field is
inherently ambiguous, and what degree of ambiguity is possible in optical
flow fields. The following analysis answers these questions.

Theorem II. Under the assumption of rigidity, an optical flow field is amenable to at
most three interpretations.

Proof: Theorem I shows that the optical flow field is enough to determine the rigid
motion parameters uniquely for most surfaces. It was seen however that in the case of
certain rational functions there is potential ambiguity in the interpretation of
motion. These are the rational functions belonging to the class $R_{12}$, and written as

$$Z = \frac{ax + by + c}{d + ex + fy + gxy + hx^2 + iy^2} \qquad (4.13)$$

Planar surfaces belong to the above class of surfaces. It has been mentioned
previously that planar surfaces can have at most two interpretations. When a
surface is non planar, to have multiple interpretations of its motion, it must be of
the type given by equation (4.13) with the added property that there is no
common factor between the numerator and the denominator.

Let such a surface be undergoing rigid motion $(U, V, W, \alpha, \beta, \gamma)$. Let there
be another motion $(U', V', W', \alpha - \Delta\alpha, \beta - \Delta\beta, \gamma - \Delta\gamma)$
that produces an identical flow field. Then from equation (4.11) we have

$$VW' - V'W = ka, \qquad U'W - UW' = kb, \qquad UV' - U'V = kc \qquad (4.14)$$

where $k$ is some constant factor. Since by definition of the class $R_{12}$ at least
one of $a$, $b$ and $c$ must be non zero, therefore $k \neq 0$. This is because if $k$ is
zero then from the above set of three equations we get the result that the
translations $(U, V, W)$ and $(U', V', W')$ are identical up to a scale factor. Hence by
Lemma I of Appendix I, the motion is not ambiguous.

Multiplying the first equation by $U$, the second by $V$, and the third by $W$ and
adding the three equations we have

$$(aU + bV + cW)k = 0$$

This means that the motion can only be ambiguous when

$$aU + bV + cW = 0 \qquad (4.15)$$

Similarly it can be shown that

$$aU' + bV' + cW' = 0 \qquad (4.16)$$

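Again comparing the denominator of the rational function with equation (4.12),
and combining the constant $k$ with the translation parameter $(U', V', W')$:

$$\Delta\beta V' + \Delta\alpha U' = d \qquad (4.17)$$

$$\Delta\alpha W' + \Delta\gamma U' = -e \qquad (4.18)$$

$$\Delta\beta W' + \Delta\gamma V' = -f \qquad (4.19)$$

$$\Delta\alpha V' + \Delta\beta U' = -g \qquad (4.20)$$

$$\Delta\beta V' + \Delta\gamma W' = h \qquad (4.21)$$

$$\Delta\alpha U' + \Delta\gamma W' = i \qquad (4.22)$$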

From equations (4.17), (4.21) and (4.22) we get:

$$2\Delta\alpha U' = q \qquad (4.23)$$

$$2\Delta\beta V' = r \qquad (4.24)$$

$$2\Delta\gamma W' = s \qquad (4.25)$$

where $q = d + i - h$, $r = d - i + h$, $s = -d + i + h$. Substituting from the above
equations into equations (4.18), (4.19) and (4.20):

$$qV'^2 + rU'^2 + 2gU'V' = 0 \qquad (4.26)$$

$$rW'^2 + sV'^2 + 2fV'W' = 0 \qquad (4.27)$$

$$qW'^2 + sU'^2 + 2eU'W' = 0 \qquad (4.28)$$

Equations (4.26), (4.27) and (4.28), together with equation (4.16), can admit no
more than two solutions. This is because at least one of $(q, r, s, e, f, g)$ must be
nonzero. Therefore, since there can be at most two spurious solutions (recall
that the veridical solution corresponds to $k = 0$), the implication is that:

When  the  optical  flow  field  has  more  than  one  interpretation,  the  number  of 
globally  consistent  solutions  for  the  motion  parameters  can  be  at  most  three. 

This  completes  the  proof  of  the  theorem. 

It will be shown that there exist surfaces whose rigid motion induces optical
flow that is compatible with three distinct interpretations. This fact explains why
Longuet-Higgins and Prazdny [19] noted that, from local optical flow constraints
and their derivatives, three interpretations of the motion are possible, since the
constraint equations were cubic.

An example of a 2D motion field with three distinct rigid motion interpretations:

The equation of the moving surface is

$$Z = \frac{1}{gxy}$$

The motion parameters are $(U, V, 0, \alpha, \beta, \gamma)$. The expression for optical
flow is therefore

$$u = Ugxy - \alpha xy + \beta(x^2 + 1) - \gamma y$$
$$v = Vgxy - \alpha(y^2 + 1) + \beta xy + \gamma x$$

Alternative interpretation I:

$$Z = \frac{U}{g[Uxy - V(x^2 + 1)]}$$

where the motion parameters are $(U, 0, 0, \alpha, \beta + gV, \gamma)$. The optical flow
field is given by

$$u_I = U\frac{g}{U}[Uxy - V(x^2 + 1)] - \alpha xy + (\beta + gV)(x^2 + 1) - \gamma y$$
$$v_I = -\alpha(y^2 + 1) + (\beta + gV)xy + \gamma x$$

Alternative interpretation II:

$$Z = \frac{V}{g[Vxy - U(y^2 + 1)]}$$

The motion parameters are $(0, V, 0, \alpha - gU, \beta, \gamma)$. The optical flow field is

$$u_{II} = -(\alpha - gU)xy + \beta(x^2 + 1) - \gamma y$$
$$v_{II} = V\frac{g}{V}[Vxy - U(y^2 + 1)] - (\alpha - gU)(y^2 + 1) + \beta xy + \gamma x$$

It is easily verified that $u = u_I = u_{II}$ and $v = v_I = v_{II}$.
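The equality of the three flow fields can also be confirmed numerically. The
sketch below assumes unit focal length, the surface 1/Z = g*x*y moving with
parameters (U, V, 0, alpha, beta, gamma), and the two alternative surface and
motion pairs obtained by moving the V-dependent or U-dependent part of the flow
from the translational term into the rotational term. Working with inverse depth
avoids divisions; the function names and values are illustrative, and the sign of
the depth is not checked.

```python
def flow_from_inverse_depth(x, y, inv_Z, T, Om):
    """Flow (u, v) at (x, y) for inverse depth inv_Z, unit focal length."""
    U, V, W = T
    a, b, g = Om
    u = (U - x * W) * inv_Z + b - y * g - a * x * y + b * x * x
    v = (V - y * W) * inv_Z - a + x * g - a * y * y + b * x * y
    return u, v


def three_interpretations(x, y, g, U, V, alpha, beta, gamma):
    """Flow at (x, y) under the original and two alternative interpretations."""
    # Original: 1/Z = g*x*y, motion (U, V, 0, alpha, beta, gamma).
    original = flow_from_inverse_depth(
        x, y, g * x * y, (U, V, 0.0), (alpha, beta, gamma))
    # Alternative I: translation along x only, beta increased by g*V.
    inv_Z1 = (g / U) * (U * x * y - V * (x * x + 1.0))
    alt1 = flow_from_inverse_depth(
        x, y, inv_Z1, (U, 0.0, 0.0), (alpha, beta + g * V, gamma))
    # Alternative II: translation along y only, alpha decreased by g*U.
    inv_Z2 = (g / V) * (V * x * y - U * (y * y + 1.0))
    alt2 = flow_from_inverse_depth(
        x, y, inv_Z2, (0.0, V, 0.0), (alpha - g * U, beta, gamma))
    return original, alt1, alt2
```

The three returned `(u, v)` pairs should agree at every image point, provided
`U` and `V` are both nonzero.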

Theorem I states that in certain cases the optical flow field may not
indicate the motion parameters uniquely. The next theorem shows how
unambiguous determination of the motion parameters can be achieved from
optical flow data.


Theorem III: Given the optical flow values at three non collinear retinal locations,
where the temporal derivative (or time difference) of the flow is nonzero, the motion
parameters are uniquely determined.


Proof:  The  essential  fact  on  which  the  proof  is  based  is  that  the  rotational 
component  of  optical  flow  is  not  dependent  on  time.  Thus  if  during  a  short 
observation  period  the  parameters  of  motion  remain  fixed  then  the  temporal 
derivative of the flow is only dependent upon the change in the translational
segment  of  the  flow.  Although  the  following  proof  uses  temporal  derivatives, 
differences  also  lead  to  the  same  result. 
retinal locations where the temporal derivatives of the flow are non zero. Once the
FOE  is  determined,  the  rotational  velocity  can  also  be  uniquely  computed.  (Note 
that,  instead  of  temporal  derivatives,  differences  can  also  lead  to  the  same  result.) 

Another way of resolving the ambiguity in the optical flow is by using shape
information. There is a strong relationship between the parameters of motion, the
optical flow field and the structure of a moving surface. The following propositions
make this concept clear.

Proposition I. When the parameters (i.e. $x_0, y_0, \alpha, \beta, \gamma$) describing the
motion of a rigid surface are known then the structure of the surface is uniquely
determined from the optical flow field.

Proof: The proof is evident from equation (4.5). Note that we can obtain the
depth function up to a constant dilation factor $W$. In other words the ratio of
depths at any two image points can be computed.

Proposition  II.  When  the  structure  of  a  surface  is  known  then  the  parameters 
describing  its  rigid  motion  are  uniquely  obtained  from  the  optical  flow  generated  by 
the  motion. 

Proof:  See  Appendix  II. 

Even the partial specification of shape can lead to a correct perception of
rigid motion. An illustration of the fact that shape information can disambiguate
between alternative motion interpretations comes from the next theorem.


Theorem IV: The motion of a planar surface whose direction of translation does not
lie in the plane of its surface normal and the line of sight, can be interpreted
correctly from the optical flow generated when the tilt of the plane is known.

Proof: Let the equation of the planar surface be

$$Z = \frac{d}{1 - px - qy}$$

where $(p, q)$ is the orientation of the depth plane and $d$ is the distance from the
origin along the z axis (i.e. the line of sight). Substituting the above into equation
(4.5) and observing that we can ignore multiplication of the translational
parameters by a constant (such as $d$) since we can compute the former up to a
scale factor anyway, we have

$$u = l_1 - l_2 x - l_3 y + l_4 xy + l_5 x^2$$
$$v = l_6 + l_7 x - l_8 y + l_5 xy + l_4 y^2 \qquad (4.31)$$

where the unknowns $\{l_i\}$ are given by

$$U + \beta = l_1 \qquad (4.32.1)$$
$$Up + W = l_2 \qquad (4.32.2)$$
$$Uq + \gamma = l_3 \qquad (4.32.3)$$
$$Wq - \alpha = l_4 \qquad (4.32.4)$$
$$Wp + \beta = l_5 \qquad (4.32.5)$$
$$V - \alpha = l_6 \qquad (4.32.6)$$
$$\gamma - Vp = l_7 \qquad (4.32.7)$$
$$Vq + W = l_8 \qquad (4.32.8)$$

Note that (4.32) are linear homogeneous equations in eight unknowns. Thus if we
can solve for the synthetic parameters $\{l_i\}$ by making measurements at four
suitable points, and in addition can measure the tilt of the depth plane, i.e.

$$\tau = \frac{p}{q} \qquad (4.33)$$

then from (4.32.7), (4.32.8) and (4.33) we have:

$$\gamma + \tau W = l_7 + \tau l_8 \qquad (4.34.1)$$

From (4.32.2), (4.32.3) and (4.33) we have:

$$\tau\gamma - W = \tau l_3 - l_2 \qquad (4.34.2)$$

Therefore, since $\tau^2 + 1 \neq 0$, we have:

$$\gamma = \frac{l_7 + \tau(l_8 - l_2) + \tau^2 l_3}{\tau^2 + 1} \qquad (4.34.3.1)$$

$$W = \frac{l_2 + \tau(l_7 - l_3) + \tau^2 l_8}{\tau^2 + 1} \qquad (4.34.3.2)$$

Now if $W \neq l_8$ (i.e. $q \neq 0$) we have from (4.32.8) and (4.32.3):

$$\frac{U}{V} = \frac{l_3 - \gamma}{l_8 - W} = k \qquad (4.34.4)$$

otherwise if $l_7 \neq \gamma$ (i.e. $p \neq 0$) we have from equations (4.32.7) and (4.32.2):

$$\frac{U}{V} = \frac{l_2 - W}{\gamma - l_7} = k \qquad (4.34.4)$$

(if both $p$ and $q$ are zero then the parameters are easily solved for)

Now from (4.34.4), (4.32.6) and (4.32.1) we have:

$$k\alpha + \beta = l_1 - kl_6 \qquad (4.34.5)$$

Also from (4.32.5) and (4.32.4) we have:

$$\tau\alpha + \beta = l_5 - \tau l_4 \qquad (4.34.6)$$

Therefore, since $\tau \neq k$, from the assumption made in the statement of the
theorem, equations (4.34.5) and (4.34.6) are independent, and we have:

$$\alpha = \frac{(l_1 - kl_6) - (l_5 - \tau l_4)}{k - \tau} \qquad (4.34.7.1)$$

$$\beta = \frac{k(l_5 - \tau l_4) - \tau(l_1 - kl_6)}{k - \tau} \qquad (4.34.7.2)$$
Now  U  and  V  can  be  determined  from  equations  (4-32.6)  and  (4-32.1).  Thus  we 
have  determined  the  motion  parameters  uniquely  from  the  optical  flow  and  tilt 
information. 
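The recovery procedure of the proof can be sketched as a short round-trip
computation. The code below assumes unit focal length, a plane Z = d/(1 - px - qy)
with d absorbed into the translation, eight synthetic parameters defined as
l1 = U + beta, l2 = Up + W, l3 = Uq + gamma, l4 = Wq - alpha, l5 = Wp + beta,
l6 = V - alpha, l7 = gamma - Vp, l8 = Vq + W, and a tilt tau = p/q. These
definitions and the function names are assumptions made for illustration. The
sketch also assumes V is nonzero and that U/V differs from tau.

```python
def synthetic_parameters(T, Om, p, q):
    """Eight flow coefficients (l1..l8) of a moving plane, assumed definitions."""
    U, V, W = T
    a, b, g = Om
    return (U + b, U * p + W, U * q + g, W * q - a,
            W * p + b, V - a, g - V * p, V * q + W)


def recover_motion(l, tau, eps=1e-12):
    """Recover ((U, V, W), (alpha, beta, gamma)) from l1..l8 and the tilt tau."""
    l1, l2, l3, l4, l5, l6, l7, l8 = l
    den = tau * tau + 1.0
    g = (l7 + tau * (l8 - l2) + tau * tau * l3) / den
    W = (l2 + tau * (l7 - l3) + tau * tau * l8) / den
    # k is the ratio U/V; pick whichever quotient is well defined.
    if abs(l8 - W) > eps:
        k = (l3 - g) / (l8 - W)
    else:
        k = (l2 - W) / (g - l7)
    # Two independent linear equations in alpha and beta (requires k != tau).
    rhs_k = l1 - k * l6
    rhs_tau = l5 - tau * l4
    a = (rhs_k - rhs_tau) / (k - tau)
    b = (k * rhs_tau - tau * rhs_k) / (k - tau)
    U = l1 - b
    V = l6 + a
    return (U, V, W), (a, b, g)
```

Feeding the output of `synthetic_parameters` together with `p / q` into
`recover_motion` should return the motion that generated the coefficients.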

At  this  point  it  may  be  mentioned  in  passing  that  it  is  possible  to  obtain  the 
motion  parameters  uniquely  from  the  optical  flow  generated  by  two  planes  moving 
together  rigidly.  In  this  case  the  optical  flow  is  locally  second  order.  If  the  eight 
synthetic  parameters  are  now  measured  at  two  different  regions  of  the  flow  field 
then 


‘ 

L  A-£  +  H  A—  =  A/; 

u  u 

I  A^  =  A/, 

Vi 

II  iC  = 


^  A  j  -  ^ 

-  AA^  =  A /? 
a 

I'A^  +  HA-  =  A/8 
a  a 

where  the  two  planes  involved  in  the  motion  are  given  by  = 


(4.35) 


and 


px  +  q\  +  1 

.  The  A  operator  in  front  of  any  quantity  denotes  the  difference 


p  x  +  q  \  +  1 

of  the  corresponding  parameters  for  the  two  planes,  e.g.  \p  -  £-  - 
The above equations imply that when at least one of $\Delta p$ or $\Delta q$ or $\Delta\frac{1}{d}$ is non zero
the translational parameters are uniquely determined. Hence in such a case the
rigid motion parameters are determined uniquely from the optical flow field (see
Appendix I). Therefore

When two planes, neither of which passes through the origin, move rigidly
together, their motion is uniquely determinable from the optical flow field
generated.

4.2.  Summary  and  Discussions 

The analysis presented here leads to considerable insight into the 3D motion
interpretation problem. Previous results (e.g. [9,30]) by Huang and his colleagues
presented sufficient conditions for uniqueness of three dimensional motion
interpretation, since they were concerned with specific algorithms. The work
reported here deals with necessary conditions for unique interpretation of 3D
motion from the optical flow field.

While the surface denoted by equation (4.13) does mean second order
surfaces containing the nodal point of the camera, it is certainly true that not all
such surfaces admit ambiguous interpretations of their 3D motions. Multiple
interpretations require, in addition, that the constraints given by (4.16), (4.26),
(4.27) and (4.28) all be satisfied.

Thus consider an algorithm, such as Prazdny's [22], where nonlinear (and
independent) flow constraints at five retinal locations are used to obtain a 3D
motion interpretation. It is now possible to answer the question as to whether the
solution obtained is the only one possible. Since a set of motion parameters
is now known, from equation (4.5) the relative depth $\frac{Z}{W}$ can be obtained at
the five retinal locations. The latter, when substituted into equation (4.13),
generates five linear equations in the surface parameters $a, b, c, d, e, f, g, h, i$.
These together with the four constraints (4.16), (4.26), (4.27) and (4.28) constitute
nine linear homogeneous equations in the nine surface parameters. Therefore
uniqueness of interpretation is guaranteed if the determinant of the above system is
nonzero, which in turn implies that all the surface parameters must be zero. This
makes it impossible to construct any other interpretation from measurements at
the five retinal locations, guaranteeing that the solution obtained is the only one
possible.

5.  Computational  Techniques  for  obtaining  the  Rigid  Motion  Parameters 

The main difficulty in computing the 3D rigid motion parameters is that the
equation constraining the image motion to the 3D motion is nonlinear. Another
complication arises from the high dimensionality of the parameter space. If it
were possible to separate the component of the image displacement due to
translation from that due to rotation we could have efficient algorithms for the
computation of the 3D motion.

The constraint equations developed by Longuet-Higgins and Prazdny [19] are
used by Bruss and Horn [6] to arrive at the parameter set that minimizes the
square of the error between the measured optical flow and the flow computed
from the parameter constraint. In general such a technique will give rise to a
system of non-linear equations from which the parameters must be computed
using some suitable iteration scheme. Longuet-Higgins and Prazdny mention the
possibility of using motion parallax to simplify the computation of the global
motion parameters. Lawton and Rieger [24] use a similar idea to factor out the
rotational component of the optical flow at depth discontinuities or regions where
the depth gradient is large. This method is not reliable since it hinges upon the
ability to compute flow vectors reasonably accurately at discontinuities, and
almost all algorithms to date for computing optical flow face problems at regions
where the field is sharply discontinuous.

5.1.  Computing  Rigid  Motion  Parameters  From  Optical  Flow 

Attempts at segmenting the parameter space of rigid motion into translational
and rotational components can be termed marginally successful, at best. A simple
way to estimate the motion parameters from the bilinear flow constraint equation
(2.10) is by means of the Hough transform technique [2,5]. There are two
problems that are immediately apparent, namely, the nonlinearity of the
constraint, and the large dimension (e.g. five) of the parameter space. Another
method is to linearize the constraint equation by writing (2.10) as a linear equation
in eight parameters. Obviously these eight parameters are each functions of the
values of the five actual parameters. This implies that linear least square methods
are not applicable here, since the eight synthetic parameters are not independent
of one another. Finally it is shown that the information in the variation of the
optical flow field, i.e. the spatio-temporal derivatives of the flow field, facilitates the
computation of the motion parameters.

5.2.  The  Analysis  of  General  Motion 

Here the situation is complicated by the fact that we have to determine
several sets of parameters, corresponding to the several bodies in motion. This
problem is obviously quite hard and is still open. It has been studied in restricted
domains by Fennema & Thompson [11]. The Hough transform technique proposed
earlier in this paper still works. The only difference is that we have to look for
multiple peaks in the parameter space after the Hough voting. These then would, of
course, correspond to the parameter sets of the various bodies in motion with
respect to the sensor.

Methods involving the spatial derivatives of the optical flow can again be
applied. There is no known technique for obtaining the optical flow in all types of
imaging situations. Also the computed flow field is noisy, to say the least. This
difficulty is compounded when we consider the case where several bodies are in
motion with respect to the sensor. Thus obtaining spatial derivatives of the flow
may not be practically possible over large portions of the image frame.

Recently a way of determining motion parameters from 3D flow has been
suggested [3]. This method is amenable to adaptation to the general motion case. It
is not clear how difficult it is to compute the 3D flow in this case. However,
it can be shown that in case a depth map can be obtained (by some stereo
matching technique), the 3D map can be calculated.


5.3.  Algorithms  for  motion  perception 

Computer algorithms for determining the parameters of rigid motion will now
be  discussed  in  the  light  of  the  various  ideas  put  forth  in  earlier  sections.  The 
treatment  will  consider  both  orthographic  and  perspective  projections,  as  well  as 
differential  and  discrete  motions.  In  some  of  the  cases  the  steps  of  the  algorithms 
will  be  described  with  a  fair  amount  of  detail.  In  others  details  will  be  omitted, 
particularly  when  the  algorithm  in  question  has  a  structure  which  is  similar  to  one 
already  described.  In  all  of  the  algorithms  the  Hough  Transform  technique  (see  [2] 
for  details)  is  used  to  compute  the  desired  global  parameters  from  sets  of 
constraint  equations  obtained  at  different  image  (or  retinal)  locations.  It  should  be 
noted  that  least  square  error  minimization  techniques  are  also  applicable  in  most 
cases. 
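The voting idea behind the Hough transform can be illustrated on a deliberately
simplified problem. The sketch below is not one of the algorithms of this paper:
it assumes purely translational motion, for which every flow vector lies on a
line through the FOE, and it votes for the FOE on a two dimensional grid. The
function name, grid bounds, grid size and tolerance are all hypothetical choices.

```python
import math


def hough_foe(samples, grid_min=-1.0, grid_max=1.0, n=41, tol=0.02):
    """Estimate the FOE (x0, y0) from samples (x, y, u, v) by accumulator voting."""
    step = (grid_max - grid_min) / (n - 1)
    acc = [[0] * n for _ in range(n)]
    for x, y, u, v in samples:
        norm = math.hypot(u, v)
        if norm == 0.0:
            continue  # a zero flow vector carries no direction information
        for i in range(n):
            x0 = grid_min + i * step
            for j in range(n):
                y0 = grid_min + j * step
                # Distance from the candidate FOE to the line through (x, y)
                # along the flow direction (u, v).
                dist = abs(u * (y0 - y) - v * (x0 - x)) / norm
                if dist < tol:
                    acc[i][j] += 1
    # The cell with the most votes is taken as the estimate.
    _, i, j = max((acc[i][j], i, j) for i in range(n) for j in range(n))
    return grid_min + i * step, grid_min + j * step
```

With several moving bodies the same accumulator would show several peaks, one per
body, which is the multiple-mode situation mentioned for the general motion case.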

For the sake of simplicity the motion of a single rigid body is considered. To
extend the following methods to the motion of several moving bodies, either the
image motion field has to be segmented, or, when the Hough transform is used,
multiple modes have to be detected in the parameter "voting" distribution.

Recall that for the case of differential motion, optical flow is denoted by $(u, v)$,
the translation parameters (velocity) by $(U, V, W)$ or
$(x_0 = F\frac{U}{W}, y_0 = F\frac{V}{W})$, and the rotational parameters by
$(\alpha, \beta, \gamma)$.


5.4.  Differential  motion  under  Orthography 


This case has been analyzed previously by Hoffman and Sugihara [13,26].
Hoffman shows that motion parameters are not uniquely determinable from local
analysis of optical flow. However, this is not the case for global analysis
techniques. It has been previously shown that, for non planar surfaces, global
analysis will give rise to unambiguous results. Sugihara computed structure from
two optical flow frames. Another interesting result was obtained by Aloimonos [1],
where it is shown when absolute depth can be recovered under pure rotation
under orthography when shape is known. Under orthography the translational
part of the optical flow field is constant and hence the translational parameters are
not computable. Hence motion parameters here always refer to the rotational
velocity parameters $(\alpha, \beta, \gamma)$.

The relevant equations are

$$\Delta u = \beta\,\Delta Z - \gamma\,\Delta y$$
$$\Delta v = -\alpha\,\Delta Z + \gamma\,\Delta x \qquad (5.1)$$

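where the $\Delta$ symbol denotes that the following quantity is a difference obtained
from measurements made at two different retinal locations. The relations between
the surface gradients and the optical flow derivatives are:

$$\frac{\partial u}{\partial x} = \beta\,\frac{\partial z}{\partial x}, \qquad \frac{\partial u}{\partial y} = \beta\,\frac{\partial z}{\partial y} - \gamma$$

$$\frac{\partial v}{\partial x} = -\alpha\,\frac{\partial z}{\partial x} + \gamma, \qquad \frac{\partial v}{\partial y} = -\alpha\,\frac{\partial z}{\partial y} \qquad (5.2)$$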


Algorithm I: Motion parameters from image motion and structure information.
The simplest instance is when the structure of the moving object is known. In the
discrete case the values of the relative depth function, $\Delta z(x,y)$, are enough to compute
the parameters $(\alpha,\beta,\gamma)$ uniquely from the linear equation (5.1). For the differential
case, structure or shape can be represented by the surface normals
$(\partial z/\partial x,\ \partial z/\partial y)$. If the surface normals are known everywhere, then we can
integrate them to obtain the depth up to a constant additive term. In other words,
$\Delta z(x,y)$ is computable. In this case measurement of optical flow at three non-collinear
points is enough to compute the rotational parameters. However, if the surface normals are
only known at sparse locations, but the optical flow field is locally known at these
locations, then we can use equation (5.2) for computing the rotation parameters. In
this case we are relying on the fact that the first derivatives of the flow can be
reliably computed. This is possible when, in the neighbourhood of the points of
interest, the optical flow values have been measured at enough locations to
allow analytic reconstruction of the optical flow function. Finally note that, if the
motion parameters are known, then the structure can be obtained from the image
motion for both the discrete and the differential cases. The steps in the algorithm
are:

1. Set up a three dimensional accumulator array for the rotation parameters.

2. For every point in the image where optical flow and surface normals are
known, select the constraint equation (5.1) if the estimated measurement error
in the surface normal function is less than that estimated for the optical flow
function; otherwise select equation (5.2).

For all values of $(\alpha,\beta,\gamma)$:

If $(\alpha,\beta,\gamma)$ satisfies the constraint equation selected, increment the
corresponding accumulator cell.

3. Obtain the maximum value in the accumulator array. The corresponding
indices are the desired values for the rotation parameters.
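
The voting loop of Algorithm I can be sketched as follows. This is only an illustration under stated assumptions: it always uses constraint (5.1) (the choice between (5.1) and (5.2) in step 2 is omitted), the function names `hough_rotation` and `synthetic_pairs`, the tolerance `tol` and the parameter grid are hypothetical, and a dictionary stands in for the three dimensional accumulator array.

```python
from itertools import product

def hough_rotation(pairs, grid, tol=1e-9):
    """Return the best (alpha, beta, gamma) cell and its vote count.

    pairs: (dx, dy, dz, du, dv) differences between two image points.
    grid:  candidate values tried for each parameter (a hypothetical
           quantization; the text leaves the choice open).
    """
    accumulator = {}
    for a, b, g in product(grid, repeat=3):
        votes = 0
        for dx, dy, dz, du, dv in pairs:
            # Equation (5.1): du = b*dz - g*dy and dv = -a*dz + g*dx.
            if (abs(du - (b * dz - g * dy)) <= tol
                    and abs(dv - (-a * dz + g * dx)) <= tol):
                votes += 1
        accumulator[(a, b, g)] = votes
    # Step 3: the cell with the most votes gives the rotation parameters.
    best = max(accumulator, key=accumulator.get)
    return best, accumulator[best]

def synthetic_pairs(rotation, geometry):
    """Generate exact (dx, dy, dz, du, dv) tuples for a known rotation."""
    a, b, g = rotation
    return [(dx, dy, dz, b * dz - g * dy, -a * dz + g * dx)
            for dx, dy, dz in geometry]
```

With noisy measurements the tolerance would have to be matched to the cell size of the grid; the exact-match tolerance used here is only suitable for synthetic data.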


Algorithm II: Motion parameters and structure from image motion. When the
structure is not known then, considering the differential case and eliminating
the surface gradients $\partial z/\partial x$ and $\partial z/\partial y$ from equations (5.2):

$$\frac{\partial v}{\partial x} = -\frac{\alpha}{\beta}\,\frac{\partial u}{\partial x} + \gamma \qquad (5.3.1)$$

$$\frac{\partial u}{\partial y} = -\frac{\beta}{\alpha}\,\frac{\partial v}{\partial y} - \gamma \qquad (5.3.2)$$

Similarly, eliminating $\Delta z$ from equation (5.1):

$$\rho\,\Delta u - \gamma\,\Delta x + \rho\gamma\,\Delta y + \Delta v = 0 \qquad (5.4)$$

where $\rho = \alpha/\beta$.

It is easy to obtain quadratic equations in either $\gamma$ or $\alpha/\beta$ from the equations (5.3).
This means that in general, at every image location, at most two sets of values of the
parameters $(\alpha/\beta,\ \gamma)$ may be obtained from the measurement of the
spatial derivatives of the optical flow. However, if some global assimilation technique, like the
Hough transform (see [2]), is used then, as shown previously, if the moving
surface is non-planar, only one set of parameters will be globally consistent. An
exactly similar method, but using differences of image displacements, can be
devised for the discrete case starting from equation (5.4).
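
As an illustration of the "at most two sets of values" remark, the following sketch solves the quadratic locally. It assumes that equations (5.3) read $v_x = -\rho u_x + \gamma$ and $u_y = -v_y/\rho - \gamma$ with $\rho = \alpha/\beta$, which combine to $u_x\rho^2 + (u_y + v_x)\rho + v_y = 0$; the function name and the threshold `eps` are hypothetical.

```python
import math

def rho_gamma_candidates(ux, uy, vx, vy, eps=1e-12):
    """Candidate (rho, gamma) pairs, rho = alpha/beta, from flow derivatives.

    Combines vx = -rho*ux + gamma with uy = -vy/rho - gamma, which gives
    ux*rho**2 + (uy + vx)*rho + vy = 0.
    """
    a, b, c = ux, uy + vx, vy
    if abs(a) < eps:
        # Degenerate case: the equation is linear in rho (or has no rho term).
        roots = [] if abs(b) < eps else [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            roots = []  # no real solution, e.g. because of measurement noise
        else:
            sq = math.sqrt(disc)
            roots = [(-b + sq) / (2.0 * a), (-b - sq) / (2.0 * a)]
    # gamma follows from the first relation once rho is fixed.
    return [(rho, vx + rho * ux) for rho in roots]
```

Each image location contributes up to two candidates; a global scheme would then keep the candidate that recurs across locations.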

5.5.  Differential  motion  under  Perspective 

The relation between the optical flow and the motion parameters is given by the
equations:

$$u = \frac{U - xW}{Z} - \alpha xy + \beta(x^2 + 1) - \gamma y$$

$$v = \frac{V - yW}{Z} - \alpha(y^2 + 1) + \beta xy + \gamma x \qquad (5.5)$$


From the above we obtain, by eliminating $Z$:

$$\frac{u + \alpha xy - \beta(x^2 + 1) + \gamma y}{v + \alpha(y^2 + 1) - \beta xy - \gamma x} = \frac{U - xW}{V - yW} \qquad (5.6)$$

Observe that the value of the right hand side of the above equation is
unchanged when the translational parameters are multiplied by some constant.
Hence we can determine the translational parameters only up to a scale factor. If
we assume that $W \neq 0$ then the previous equation can be written as:

$$\frac{u + \alpha xy - \beta(x^2 + 1) + \gamma y}{v + \alpha(y^2 + 1) - \beta xy - \gamma x} = \frac{x_0 - x}{y_0 - y} \qquad (5.7)$$

If $W = 0$ then (5.6) reduces to:

$$\frac{u + \alpha xy - \beta(x^2 + 1) + \gamma y}{v + \alpha(y^2 + 1) - \beta xy - \gamma x} = \frac{U}{V} \qquad (5.8)$$

Equations (5.6), (5.7) and (5.8) are bilinear in the translation and the rotation
parameters. This nonlinearity makes it difficult to combine constraints from
different image locations to compute the motion parameters. To summarize, the
problems with computation of motion parameters are:

1.  The  constraint  equations  are  nonlinear. 

2.  The  parameter  space  is  of  high  (e.g.  five)  dimensionality. 

Algorithm III: Hough transform in 5D parameter space. This type of
algorithm can be easily realised by simple parallel neuronal hardware (see [10]).
The parameters that are to be determined are the polar angles (or direction
cosines) representing the directions of translation and rotation, and the magnitude
of the rotation vector. This representation for the rigid motion parameters is
convenient since the parameter subspaces representing directions in space become
easy to quantize by means such as geodesic tessellation of the Gaussian sphere. The
steps in the algorithm are:

1. Select a coarseness scale for the parameter subspaces: for instance, how many
distinct directions in space, the range of values estimated for the rotation
magnitude and the sampling interval in this range. Initialize the parameter
units belonging to the Hough transform space (this is the five dimensional
accumulator array where the "votes" for every parameter vector are tallied).

2.  For  all  retinal  locations  where  optical  flow  has  been  measured  do  step  3: 

3. For all possible parameter values (i.e. values of the parameter quintuple)
admitted in step 1, do:


(i)  If  the  direction  of  the  translational  velocity  is  not  parallel  to  the 
image  plane  select  equation  (5.7)  else  select  equation  (5.8). 

(ii)  If  the  parameter  values  satisfy  the  chosen  constraint  equation 
vote  for  the  corresponding  parameter  vector. 

4.  Find  the  parameter  quintuple  that  has  received  the  maximum  number  of 
votes. 

5.  Restrict  the  parameter  space  to  a  neighbourhood  of  the  selected  parameter 
quintuple.  Repeat  the  steps  from  2  to  4  after  choosing  a  finer  parameter  space 
quantization. 

6. If the error due to the parameter quantization is acceptable then stop and
return the parameter values computed. Otherwise repeat step 5.
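
The coarse-to-fine loop of steps 1 to 6 can be illustrated on a deliberately reduced problem. The sketch below is not the five dimensional scheme: it assumes the rotation is known to be zero, so that (5.7) collapses to $u(y_0 - y) - v(x_0 - x) = 0$ and only the two parameters $(x_0, y_0)$ are searched. The search range, the grid size `cells`, the number of `levels` and the voting tolerance (one cell width) are hypothetical choices.

```python
import math

def vote(samples, x0, y0, tol):
    """Count flow samples whose constraint line passes within tol of (x0, y0)."""
    n = 0
    for x, y, u, v in samples:
        speed = math.hypot(u, v)
        if speed == 0.0:
            continue  # a zero flow vector carries no direction information
        # Zero-rotation form of (5.7): u*(y0 - y) - v*(x0 - x) = 0.
        if abs(u * (y0 - y) - v * (x0 - x)) / speed <= tol:
            n += 1
    return n

def coarse_to_fine(samples, lo=-2.0, hi=2.0, cells=17, levels=6):
    """Repeatedly vote on a grid, then zoom in around the winning cell."""
    cx = cy = (lo + hi) / 2.0
    half = (hi - lo) / 2.0
    best, best_votes = (cx, cy), -1
    for _ in range(levels):
        step = 2.0 * half / (cells - 1)
        best_votes = -1
        for i in range(cells):
            for j in range(cells):
                x0 = cx - half + i * step
                y0 = cy - half + j * step
                n = vote(samples, x0, y0, tol=step)
                if n > best_votes:
                    best_votes, best = n, (x0, y0)
        # Step 5: restrict the space to a neighbourhood of the winner.
        cx, cy = best
        half = 2.0 * step
    return best, best_votes
```

The ratio of `best_votes` to the number of samples plays the role of the confidence measure; the same structure carries over to more parameters at the cost of a larger accumulator.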

Some Remarks:

(i) The space and time required by the algorithm is reduced by periodically
examining the parameter accumulator units and purging those that have
collected only a few "votes" compared to the top contenders. This is
possible, since it is assumed that the noise in the optical flow data is
uniformly distributed in retinal space.

(ii) The confidence of the computed parameter quintuple is the ratio of the votes
it received to the maximum votes possible.

(iii) If in step 4, instead of a clear winner, a number of contenders are found, then
step 5 might have to be repeated for each of these at finer resolutions. Then
the winner is the parameter quintuple that comes through with the highest
confidence.

(iv) If it is estimated that p% of the optical flow values are corrupted by noise, then
the acceptable confidence of the result is (100 - p)% with a tolerance of, say,
10%.


Algorithm III performs well when the quantization of the parameter space is
not "too coarse". This is mainly due to the nonlinearity of the constraint equation
used. This problem can be alleviated by linearizing the constraint equation,
although in this case the price we pay is that the dimensionality of the parameter
space increases. In the following discussion it is assumed that not all the
translational velocity components are zero. This is a valid assumption since it has
been shown in a previous section that the motion parameters for pure rotational
motion are uniquely determinable.

From equation (5.6) we have:

$$(yu - xv)W + vU - uV - x(\alpha W + \gamma U) - y(\beta W + \gamma V) - xy(\alpha V + \beta U) + (\alpha U + \beta V) + x^2(\beta V + \gamma W) + y^2(\alpha U + \gamma W) = 0 \qquad (5.9)$$

Now  we  state  and  prove  a  lemma  regarding  the  feasibility  of  computing  the 
motion  parameters  using  the  constraint  given  above. 

Lemma I: The optical flow components can be expressed as an implicit polynomial
equation $f(u,v,x,y;\ p_i,\ i = 1,\ldots,8) = 0$ involving the image coordinates $(x,y)$ and eight
linearly independent parameters $p_i$, unless the depth function is a rational
function of the form $Z = P_1(x,y)/Q_2(x,y)$, where $P_1$ and $Q_2$ are polynomials of
first and second orders respectively.

Proof: Equation (5.9) is homogeneous in the motion parameters. Assume that
the parameter $W \neq 0$ (the case where $W = 0$ but either $U$ or $V \neq 0$ can be worked
out in an analogous manner). Dividing the above equation by $W$ yields:

$$(yu - xv) + p_1 v - p_2 u + p_3 - p_4 x - p_5 y + p_6 x^2 + p_7 y^2 - p_8 xy = 0 \qquad (5.10)$$

where

$$p_1 = x_0 \qquad (5.11a)$$

$$p_2 = y_0 \qquad (5.11b)$$

$$p_3 = \alpha x_0 + \beta y_0 \qquad (5.11c)$$

$$p_4 = \alpha + \gamma x_0 \qquad (5.11d)$$

$$p_5 = \beta + \gamma y_0 \qquad (5.11e)$$

$$p_6 = \gamma + \beta y_0 \qquad (5.11f)$$

$$p_7 = \gamma + \alpha x_0 \qquad (5.11g)$$

$$p_8 = \beta x_0 + \alpha y_0 \qquad (5.11h)$$

The parameters $p_i$ are linearly dependent iff

$$k_1 v - k_2 u + k_3 - k_4 x - k_5 y + k_6 x^2 + k_7 y^2 - k_8 xy = 0 \qquad (5.12)$$

where the $k_i$ are constants not all of which are zero. Let the optical flow be due
to a rigid surface $Z$ moving with velocity $(U,V,W,\alpha,\beta,\gamma)$. In this case:

$$u = \frac{U - xW}{Z} - \alpha xy + \beta(x^2 + 1) - \gamma y$$

$$v = \frac{V - yW}{Z} - \alpha(y^2 + 1) + \beta xy + \gamma x \qquad (5.13)$$

Assume that the parameters $p_i$ are linearly dependent. This implies that in
equation (5.12) there must be at least one $k_i$ that is not equal to zero. However, if
both $k_1$ and $k_2$ are zero, then all the $k_i$ must be zero. Hence, if the parameters
$p_i$ are linearly dependent, then at least one of $k_1$ and $k_2$ must be nonzero.
Substituting for $u$ and $v$ in equation (5.12) from equation (5.13) we obtain:

$$k_1\left(\frac{V - yW}{Z} - \alpha(y^2 + 1) + \beta xy + \gamma x\right) - k_2\left(\frac{U - xW}{Z} - \alpha xy + \beta(x^2 + 1) - \gamma y\right) + k_3 - k_4 x - k_5 y + k_6 x^2 + k_7 y^2 - k_8 xy = 0$$

Since $k_1$ and $k_2$ are not both zero, we obtain $Z$ as a rational function of the form
$P_1(x,y)/Q_2(x,y)$. This proves the lemma.

Lemma II: The five parameters of rigid motion are uniquely determined by the
parameters $p_i$.

Algorithm IV: Equation (5.10) is the basis of a Hough transform scheme to
recover the motion parameters. The advantage of this scheme is that the constraint
equation is linear in the "synthetic" parameters $p_i$. Once these parameters are
computed the five rigid motion parameters are uniquely determined.
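
The last step of Algorithm IV, going from the synthetic parameters back to the five motion parameters, can be sketched as below. The sketch assumes the reading $p_1 = x_0$, $p_2 = y_0$, $p_3 = \alpha x_0 + \beta y_0$, $p_4 = \alpha + \gamma x_0$, $p_5 = \beta + \gamma y_0$, $p_6 = \gamma + \beta y_0$, $p_7 = \gamma + \alpha x_0$, $p_8 = \beta x_0 + \alpha y_0$ of equations (5.11), obtained by expanding (5.6). The least-squares recovery through the normal equations is one possible choice, not necessarily the one intended in the text, and all function names are hypothetical.

```python
def p_from_motion(x0, y0, a, b, g):
    """Synthetic parameters p1..p8 of (5.10) for given motion parameters."""
    return [x0, y0, a * x0 + b * y0, a + g * x0, b + g * y0,
            g + b * y0, g + a * x0, a * y0 + b * x0]

def residual_5_10(p, x, y, u, v):
    """Left-hand side of (5.10); zero for a flow vector consistent with p."""
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    return ((y * u - x * v) + p1 * v - p2 * u + p3 - p4 * x - p5 * y
            + p6 * x * x + p7 * y * y - p8 * x * y)

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def motion_from_p(p):
    """Recover (x0, y0, alpha, beta, gamma) from p1..p8 by least squares."""
    x0, y0 = p[0], p[1]
    # Rows express p3..p8 as linear functions of (alpha, beta, gamma).
    rows = [[x0, y0, 0.0], [1.0, 0.0, x0], [0.0, 1.0, y0],
            [0.0, y0, 1.0], [x0, 0.0, 1.0], [y0, x0, 0.0]]
    rhs = p[2:]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(3)]
    d = det3(ata)
    sol = []
    for k in range(3):  # Cramer's rule on the 3x3 normal equations
        mk = [row[:] for row in ata]
        for i in range(3):
            mk[i][k] = atb[i]
        sol.append(det3(mk) / d)
    return tuple([x0, y0] + sol)
```

Because (5.10) is linear in the $p_i$, the $p_i$ themselves could be estimated either by voting or by a linear least-squares fit over many image locations before this recovery step is applied.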

Algorithm V: Differentiating equation (5.10) with respect to the retinal space
coordinates we have two independent equations (subscripts denote partial derivatives):

$$(y u_x - v - x v_x) + p_1 v_x - p_2 u_x - p_4 + 2 p_6 x - p_8 y = 0 \qquad (5.14)$$

$$(u + y u_y - x v_y) + p_1 v_y - p_2 u_y - p_5 + 2 p_7 y - p_8 x = 0 \qquad (5.15)$$

The parameters in equations (5.14) and (5.15) are linearly independent when the
depth function is not of the form given in Lemma I. Selecting five suitable points
we obtain two alternative sets of simultaneous equations in five unknowns. These
can then be solved for the five motion parameters. Note, however, that when
$p_1 = x_0 = 0$, equation (5.14) alone cannot be used for the computation.
This is because the parameters $(p_1, p_2, p_4, p_6, p_8)$ cannot then be used to solve for the
five motion parameters. A similar restriction holds for equation (5.15) when
$p_2 = y_0 = 0$.

Algorithm VI: It has been shown that when two optical flow fields obtained at
two different time instants are available, the motion parameters are uniquely
determined from measurements at three non-collinear points on the retina. The
assumption here is that the motion parameters are stable during the measurement
period. This can be used as a basis for the motion estimation algorithm.

Algorithm VII: Motion parameters from structure and optical flow.

When the structure of the moving surface is known, its motion is
unambiguous. This method also reduces the dimensionality of the parameter
space by isolating the rotational parameters. Two alternative constraint equations
can be used here. In the first form, spatial derivatives of the optical flow function
are needed. This implies local analytic reconstruction of the flow function. In the
alternative form of the constraint, depth ratios are needed, implying reliable (and
dense) measurement of surface normals.

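From eq. (5.5) the expressions for the spatial derivatives of the optical flow $(u,v)$
are obtained as:

$$\frac{\partial u}{\partial x} = -\frac{W}{Z} - \frac{U - xW}{Z^2}\,\frac{\partial Z}{\partial x} - \alpha y + 2\beta x \qquad (5.16.1)$$

$$\frac{\partial u}{\partial y} = -\frac{U - xW}{Z^2}\,\frac{\partial Z}{\partial y} - \alpha x - \gamma \qquad (5.16.2)$$

$$\frac{\partial v}{\partial x} = -\frac{V - yW}{Z^2}\,\frac{\partial Z}{\partial x} + \beta y + \gamma \qquad (5.16.3)$$

and, likewise, $\partial v/\partial y = -W/Z - \frac{V - yW}{Z^2}\,\frac{\partial Z}{\partial y} - 2\alpha y + \beta x$.

Substituting $(U - xW)/Z$ and $(V - yW)/Z$ in the above equations from equation (5.5)
we get:

$$\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} = \left(-u - \alpha xy + \beta(x^2 + 1) - \gamma y\right)q + \left(v + \alpha(y^2 + 1) - \beta xy - \gamma x\right)p + \alpha y + \beta x \qquad (5.17.1)$$

$$\frac{\partial u}{\partial y} = \left(-u - \alpha xy + \beta(x^2 + 1) - \gamma y\right)p - \alpha x - \gamma \qquad (5.17.2)$$

$$\frac{\partial v}{\partial x} = \left(-v - \alpha(y^2 + 1) + \beta xy + \gamma x\right)q + \beta y + \gamma \qquad (5.17.3)$$

where $q = \frac{1}{Z}\frac{\partial Z}{\partial x}$ and $p = \frac{1}{Z}\frac{\partial Z}{\partial y}$.

Thus at every image location $(x,y)$, a set of three linearly independent equations
involving the rotation parameters can be obtained. The functions $q(x,y)$ and
$p(x,y)$ are computable from the surface orientation values $\partial Z/\partial X$ and $\partial Z/\partial Y$ (see
Appendix II).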

When it is not possible to measure derivatives of the optical flow, but the
ratio of depths at any two image locations can be estimated, an alternative linear
constraint equation can be derived involving only the rotation parameters.
Consider two image points $(x_1,y_1)$ and $(x_2,y_2)$ with depths $z_1$ and $z_2$ respectively.
The optical flow values at these points are $(u_1,v_1)$ and $(u_2,v_2)$. The motion
parameters are $(U,V,W,\alpha,\beta,\gamma)$. Using equation (5.5) we have the following
equations:

$$u_1 z_1 - u_2 z_2 = (x_2 - x_1)W + z_1\left(-\alpha x_1 y_1 + \beta(x_1^2 + 1) - \gamma y_1\right) - z_2\left(-\alpha x_2 y_2 + \beta(x_2^2 + 1) - \gamma y_2\right)$$

$$v_1 z_1 - v_2 z_2 = (y_2 - y_1)W + z_1\left(-\alpha(y_1^2 + 1) + \beta x_1 y_1 + \gamma x_1\right) - z_2\left(-\alpha(y_2^2 + 1) + \beta x_2 y_2 + \gamma x_2\right)$$

Eliminating $W$ from the above equations we have

$$l_{12}\,\alpha + m_{12}\,\beta + n_{12}\,\gamma + s_{12} = 0 \qquad (5.18)$$

where

$$l_{12} = x_1 y_1 y_2 - x_2 y_1^2 + x_1 - x_2 + \frac{z_2}{z_1}\left(x_2 y_1 y_2 - x_1 y_2^2 - x_1 + x_2\right)$$

$$m_{12} = x_1 x_2 y_1 - x_1^2 y_2 + y_1 - y_2 + \frac{z_2}{z_1}\left(x_1 x_2 y_2 - x_2^2 y_1 - y_1 + y_2\right)$$

$$n_{12} = x_1 x_2 + y_1 y_2 - x_1^2 - y_1^2 + \frac{z_2}{z_1}\left(-x_2^2 - y_2^2 + x_1 x_2 + y_1 y_2\right)$$

$$s_{12} = u_1(y_2 - y_1) - v_1(x_2 - x_1) + \frac{z_2}{z_1}\left(-u_2(y_2 - y_1) + v_2(x_2 - x_1)\right)$$

If the surface normal values are available everywhere in a region enclosing two
image points, then the depth ratio $z_2/z_1$ (corresponding to those locations) can be
estimated (of course, mathematically, it is possible to compute this ratio if the
surface normal values are known along a path from the one image location to the
other). Consequently, each pair of image points gives rise to a linear constraint in
the rotation parameters. Thus by a suitable choice of three pairs of image points
we can uniquely solve for the rotation parameters and subsequently the translation
parameters (see Appendix I).
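
A minimal sketch of this depth-ratio variant follows. It assumes the sign convention $l\alpha + m\beta + n\gamma + s = 0$ that results from eliminating $W$ between the two difference equations derived from (5.5), with coefficients written out in the code; the helper names (`flow`, `pair_constraint`, `solve_rotation`) are hypothetical, and the 3x3 system is solved by Cramer's rule for brevity.

```python
def flow(x, y, Z, U, V, W, a, b, g):
    """Perspective optical flow of equation (5.5) at image point (x, y)."""
    u = (U - x * W) / Z - a * x * y + b * (x * x + 1) - g * y
    v = (V - y * W) / Z - a * (y * y + 1) + b * x * y + g * x
    return u, v

def pair_constraint(pt1, pt2, r):
    """Coefficients (l, m, n, s) of l*alpha + m*beta + n*gamma + s = 0.

    pt1, pt2 are (x, y, u, v) tuples and r is the depth ratio z2/z1.
    """
    x1, y1, u1, v1 = pt1
    x2, y2, u2, v2 = pt2
    l = x1*y1*y2 - x2*y1*y1 + x1 - x2 + r*(x2*y1*y2 - x1*y2*y2 - x1 + x2)
    m = x1*x2*y1 - x1*x1*y2 + y1 - y2 + r*(x1*x2*y2 - x2*x2*y1 - y1 + y2)
    n = x1*x2 + y1*y2 - x1*x1 - y1*y1 + r*(x1*x2 + y1*y2 - x2*x2 - y2*y2)
    s = u1*(y2 - y1) - v1*(x2 - x1) + r*(-u2*(y2 - y1) + v2*(x2 - x1))
    return l, m, n, s

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve_rotation(constraints):
    """Solve three pair constraints for (alpha, beta, gamma)."""
    mat = [[c[0], c[1], c[2]] for c in constraints]
    rhs = [-c[3] for c in constraints]
    d = det3(mat)
    if d == 0.0:
        raise ValueError("the three point pairs do not give independent constraints")
    sol = []
    for k in range(3):
        mk = [row[:] for row in mat]
        for i in range(3):
            mk[i][k] = rhs[i]
        sol.append(det3(mk) / d)
    return tuple(sol)
```

In practice more than three pairs would be combined, by voting or least squares, to reduce the influence of errors in the flow and in the depth ratios.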

The novel feature of the above algorithm is that it can combine shape and
motion information under two different conditions:

(1) In the first case the optical flow field has been measured sufficiently "densely"
to enable local reconstruction of the flow field. This enables the first order
spatial derivatives of the flow field to be estimated. Then at all retinal points
where the surface normals are known, we can locally solve for the rotation
parameters by means of a set of three linear constraint equations.

(2) Alternatively, if the flow measurements are not dense, but the shape
measurements allow reconstruction of the depth function (up to a constant
scale factor), then again locally we obtain linear constraints in the rotation
parameters (e.g. equation (5.18)).

This means that in any image neighbourhood, full reconstruction of either shape
or 2D motion helps to recover both structure and motion. The schematic diagram
of the algorithm is given in figure IV.

Remarks: 

(i) Note the similarity between Algorithms I and VII. In both, the local analytic
reconstructability of either the optical flow function or the surface normal
function determines the selection of the constraint equation that is to be
used.

(ii) From equations (5.17), $q$ and $p$ can be eliminated to obtain a cubic
polynomial equation in the three rotation parameters. Thus if the optical flow
and its first spatial derivatives are measured, we can use the cubic constraint to
estimate the rotation parameters by the Hough transform technique. So,
although the nonlinearity remains, the dimension of the parameter space is
reduced, which reduces the size of the search space.


5.6.  Discrete  motion  under  orthography 

This case is of interest to researchers in the field of Visual Cognitive
Modeling [31]. The reason for this is that psychological experiments by Ullman
[32] to explain human capabilities in the perception of structure from motion
agree more with the orthographic projection (actually an extension of orthography,
termed polar parallel projection [32]) than with the perspective projection models.

For the case of biological motion a plethora of proposals have been put
forward by several researchers in the area, and many potentially powerful
algorithms have been proposed [14], [15], [4], [35]. The research reported here,
however, does not cover this case of motion analysis.

5.7.  Discrete  motion  under  Perspective 

This is the most involved among all the motion types. To simplify the
analysis, Ullman [31] assumed the rotation axis to be along the z axis. The
constraint he obtained was an equation of the fourth degree in the sine of the
rotation angle. Another simplification is due to Fang and Huang, whose "small
rotation" assumption makes their analysis similar to the differential case. The
most extensive work done in this particular area is due to Tsai and Huang [30].
Their work is innovative and based on elegant mathematical formalisms. However,
a general unambiguous solution to the motion perception problem in the case of
discrete perspective is still unavailable.

6.  Conclusion 

The problem of interpretation of a moving retinal image has been studied, for
both "short range" and "long range" motion. Our findings indicate that the motion
information available in optical flow (differential case) is less than that in the
discrete displacements field (long range motion).

We saw that three temporally contiguous image frames contain enough
information to uniquely recover 3-D motion and structure under perspective
projection. Since the optical flow field (two temporally proximal frames) is, in
general, ambiguous, two frames can recover structure when the moving surface
satisfies the conditions of Theorem I.

We proved that structure and 3-D motion parameters are equivalent, in that the one
constrains the other uniquely, and both problems (determination of structure and
3-D motion parameters from retinal displacements) are better tackled this way.

We believe that our work forms an important extension to Ullman's and Huang's
theories, and, in conjunction with interpretation schemes for recovering structure
in the case of biological motion (using the planarity assumption), constitutes a
significant advance towards the solution of the problem of Motion Perception.

APPENDIX I


Uniqueness  of  Motion  Parameters  computed  from  Optical 
flow  under  Perspective  Projection 


Consider a point $P$ in space whose coordinates are $(X,Y,Z)$ with respect to a
fixed inertial frame XYZ. The image of this point is $p = (x,y)$, whose coordinates
are given with respect to an xy frame located on the image plane. The relation
between the world point $P$ and the image point $p$ is given by

$$(x,y) = \left(\frac{FX}{Z},\ \frac{FY}{Z}\right) \qquad (i)$$

where $F$ is the focal length of the imaging system. This is assumed to be unity in
the following analysis.

Now let a rigid surface move with a translational velocity $\vec{T} = (U,V,W)$ and a
rotational velocity $\vec{\omega} = (\alpha,\beta,\gamma)$. Then, from kinematics, the three dimensional
velocity of any point on the surface can be written as

$$\left(\frac{dX}{dt},\ \frac{dY}{dt},\ \frac{dZ}{dt}\right) = \vec{T} + \vec{\omega} \times \vec{P} \qquad (ii)$$

where $t$ is the time variable and $\times$ denotes the vector product.

In the differential motion case the image motion or optical flow is denoted by
$(u,v) = (dx/dt,\ dy/dt)$. Differentiating equation (i) and substituting from equation (ii) we
have the following relations:

$$u = \frac{U - xW}{Z} - \alpha xy + \beta(x^2 + 1) - \gamma y \qquad (iii.a)$$

$$v = \frac{V - yW}{Z} - \alpha(y^2 + 1) + \beta xy + \gamma x \qquad (iii.b)$$

Eliminating the unknown depth variable from the above we get

$$\frac{u + \alpha xy - \beta(x^2 + 1) + \gamma y}{v + \alpha(y^2 + 1) - \beta xy - \gamma x} = \frac{U - xW}{V - yW} \qquad (iv)$$

The above equation describes the constraint imposed by the measured value of
optical flow $(u,v)$, at an image point $(x,y)$, on the six motion parameters
$(U,V,W,\alpha,\beta,\gamma)$.


Proposition I. Given the rotation parameters, the translation parameters can
be uniquely determined from the optical flow field.

Proof: First we define the function $\eta(x,y)$, where

$$\eta(x,y) = \frac{u + \alpha xy - \beta(x^2 + 1) + \gamma y}{v + \alpha(y^2 + 1) - \beta xy - \gamma x}$$

Now we analyse the following cases:

Case 1: If $\eta$ = constant then from equation (iv) we have $W = 0$. In this case we can
only obtain the ratio $U/V$ from the optical flow field.

Case 2: If $\eta \neq$ constant then there are two image points where $\eta$ is different, in
which case we can solve the resultant set of two linear equations, obtained from
(iv), to get $x_0 = U/W$ and $y_0 = V/W$.
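
Case 2 can be made concrete with a small sketch. It assumes that equation (iv) with $W \neq 0$ is rewritten as $x_0 - \eta\,y_0 = x - \eta\,y$, which gives one linear equation in $(x_0, y_0)$ per image point; the function names are hypothetical.

```python
def eta(x, y, u, v, a, b, g):
    """Left-hand side of equation (iv) for known rotation (a, b, g)."""
    num = u + a * x * y - b * (x * x + 1) + g * y
    den = v + a * (y * y + 1) - b * x * y - g * x
    if den == 0.0:
        raise ValueError("denominator of (iv) vanishes at this point")
    return num / den

def translation_direction(pt1, pt2, rotation):
    """Solve x0 - eta*y0 = x - eta*y at two points for (x0, y0)."""
    a, b, g = rotation
    x1, y1, u1, v1 = pt1
    x2, y2, u2, v2 = pt2
    e1 = eta(x1, y1, u1, v1, a, b, g)
    e2 = eta(x2, y2, u2, v2, a, b, g)
    if e1 == e2:
        # Same eta at both points: either Case 1 or an unlucky choice of points.
        raise ValueError("eta is equal at both points; choose other points")
    y0 = ((x2 - e2 * y2) - (x1 - e1 * y1)) / (e1 - e2)
    x0 = x1 - e1 * y1 + e1 * y0
    return x0, y0
```

For example, with rotation (0.01, 0.02, -0.03) and the flow values (-0.06, -0.14) at (1, 0) and (0.075, -0.145) at (0, 1), the two values of $\eta$ are 1 and -0.2, and the sketch returns the direction of translation $(x_0, y_0) = (1/3, -2/3)$.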

Proposition II. Given the translation parameters, the rotation parameters can
be uniquely determined from optical flow.

Proof: Here the values of $x_0$ and $y_0$ are known. The expression for optical flow
is

$$u = (x_0 - x)\varphi - \alpha xy + \beta(x^2 + 1) - \gamma y$$

$$v = (y_0 - y)\varphi - \alpha(y^2 + 1) + \beta xy + \gamma x$$

where $(\alpha,\beta,\gamma)$ are the rotation parameters and $\varphi = W/Z$ is the reciprocal of the
scaled depth function. If possible, let there be another surface moving with the
same translation but different rotation parameters, generating the same optical
flow. Thus we have

$$u = (x_0 - x)\varphi' - \alpha' xy + \beta'(x^2 + 1) - \gamma' y$$

$$v = (y_0 - y)\varphi' - \alpha'(y^2 + 1) + \beta' xy + \gamma' x$$

Now, subtracting the above sets of equations appropriately, we get

$$0 = (x_0 - x)(\varphi - \varphi') - \Delta\alpha\,xy + \Delta\beta(x^2 + 1) - \Delta\gamma\,y \qquad (v.a)$$

$$0 = (y_0 - y)(\varphi - \varphi') - \Delta\alpha(y^2 + 1) + \Delta\beta\,xy + \Delta\gamma\,x \qquad (v.b)$$

where $\Delta\alpha = \alpha - \alpha'$, $\Delta\beta = \beta - \beta'$ and $\Delta\gamma = \gamma - \gamma'$. Eliminating $(\varphi - \varphi')$ from the above
we have

$$(\Delta\alpha\,x_0 + \Delta\beta\,y_0) - x(\Delta\gamma\,x_0 + \Delta\alpha) - y(\Delta\gamma\,y_0 + \Delta\beta) + x^2(\Delta\beta\,y_0 + \Delta\gamma) + y^2(\Delta\alpha\,x_0 + \Delta\gamma) - xy(\Delta\beta\,x_0 + \Delta\alpha\,y_0) = 0 \qquad (vi)$$

Since the above equation is valid everywhere in the image,

$$\Delta\alpha\,x_0 + \Delta\beta\,y_0 = 0, \qquad \Delta\gamma\,x_0 + \Delta\alpha = 0, \qquad \Delta\gamma\,y_0 + \Delta\beta = 0$$

$$\Delta\alpha\,y_0 + \Delta\beta\,x_0 = 0, \qquad \Delta\beta\,y_0 + \Delta\gamma = 0, \qquad \Delta\alpha\,x_0 + \Delta\gamma = 0$$

From the above we obtain

$$\Delta\alpha = 0, \qquad \Delta\beta = 0, \qquad \Delta\gamma = 0$$

This means that $\alpha = \alpha'$, $\beta = \beta'$ and $\gamma = \gamma'$ and therefore the rotation parameters are
uniquely determined when the translation parameters are known.

Proposition III. If the structure of a rigidly moving surface is known, then the
parameters describing its motion are uniquely determined.

Proof: Knowing structure means that we have the depth values available up to
some scale factor. Thus in equation (iii) the value $Z$ is no longer an unknown.
The unknown scale factor is lumped with the translation parameters. Now,
proceeding in a manner analogous to the previous proof, we have

$$\frac{1}{Z}(\Delta U - x\,\Delta W) = \Delta\alpha\,xy - \Delta\beta(x^2 + 1) + \Delta\gamma\,y \qquad (vii.a)$$

$$\frac{1}{Z}(\Delta V - y\,\Delta W) = \Delta\alpha(y^2 + 1) - \Delta\beta\,xy - \Delta\gamma\,x \qquad (vii.b)$$

Eliminating $Z$ we have

$$(\Delta\alpha\Delta U + \Delta\beta\Delta V) - x(\Delta\gamma\Delta U + \Delta\alpha\Delta W) - y(\Delta\beta\Delta W + \Delta\gamma\Delta V) + x^2(\Delta\gamma\Delta W + \Delta\beta\Delta V) + y^2(\Delta\gamma\Delta W + \Delta\alpha\Delta U) - xy(\Delta\alpha\Delta V + \Delta\beta\Delta U) = 0$$

Since the above equation must be valid all over the image plane, the following
relations hold:

$$\Delta\alpha\Delta U + \Delta\beta\Delta V = 0, \qquad \Delta\alpha\Delta W + \Delta\gamma\Delta U = 0, \qquad \Delta\beta\Delta W + \Delta\gamma\Delta V = 0$$

$$\Delta\alpha\Delta V + \Delta\beta\Delta U = 0, \qquad \Delta\beta\Delta V + \Delta\gamma\Delta W = 0, \qquad \Delta\alpha\Delta U + \Delta\gamma\Delta W = 0$$

From eqn. (vii) and the above relations we have

$$\Delta U = \Delta V = \Delta W = \Delta\alpha = \Delta\beta = \Delta\gamma = 0$$

Therefore, once the structure is known for a rigidly moving surface, its translation
(up to a scale factor) and its rotation are determined uniquely from the optical flow
generated by the motion.

APPENDIX II


Representations  of  surface  orientation  and  their  properties 


In computer vision, the terms surface orientation map and shape are
sometimes used interchangeably. The following is an attempt to explain the basis
of this usage. The cases of perspective as well as orthographic projections are
considered. Shape information obtainable from a surface orientation map in
image coordinates is also explored.

Representations  for  surface  orientation 

A  direction  in  three  space  is  specified  by  two  independent  parameters. 

A. (Latitude, Longitude): The coordinates are denoted by $(\theta,\phi)$ where
$0 \le \theta \le \pi$, $0 \le \phi \le 2\pi$.

B. Coordinates on the Gaussian (or unit radius) sphere. If the coordinates are
$(l,m,n)$ then $l^2 + m^2 + n^2 = 1$.

C. (Slant, tilt): Slant is the tangent of the latitude angle (or $\tan\theta$) while tilt is
the longitude angle. The symbolic notation is $(\sigma,\tau)$.

D. (Gradient): If the depth is expressed in the form $Z = f(X,Y)$, then it is the
level surface $F(X,Y,Z) = 0$, where

$$F(X,Y,Z) = f(X,Y) - Z$$

The gradient of $F()$, i.e. $\left(\frac{\partial F}{\partial X}, \frac{\partial F}{\partial Y}, \frac{\partial F}{\partial Z}\right)$, gives the orientation of the surface (in
the direction of increasing $F()$). The gradient notation is written as $(p,q)$
where $(p,q) = \left(\frac{\partial f}{\partial X}, \frac{\partial f}{\partial Y}\right)$.

Relationship among the surface normal representations:

$$\sqrt{p^2 + q^2} = \tan\theta = \sigma$$

$$\frac{q}{p} = \tan\phi = \tan\tau$$

$$(l,m,n) = \left(\frac{p}{g},\ \frac{q}{g},\ -\frac{1}{g}\right)$$

$$g = \sqrt{p^2 + q^2 + 1}$$
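
The conversions between these representations are short enough to sketch directly. The sketch assumes the relations as stated above, takes the third component of the unit normal as $-1/g$ (the gradient of $F = f(X,Y) - Z$), and uses `atan2` to select a definite tilt angle with $\tan\tau = q/p$; the function names are hypothetical.

```python
import math

def gradient_to_slant_tilt(p, q):
    """Convert a gradient (p, q) to (slant, tilt) = (sigma, tau)."""
    slant = math.hypot(p, q)   # sigma = tan(theta) = sqrt(p^2 + q^2)
    tilt = math.atan2(q, p)    # tau, chosen so that tan(tau) = q/p
    return slant, tilt

def gradient_to_unit_normal(p, q):
    """Convert a gradient (p, q) to Gaussian sphere coordinates (l, m, n)."""
    g = math.sqrt(p * p + q * q + 1.0)
    return p / g, q / g, -1.0 / g
```

For instance, the gradient (3, 4) has slant 5 and a unit normal whose squared components sum to one.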


Shape  under  Perspective  Projection 

In the case of perspective projection the relation between a world point $(X,Y,Z)$
and its projection $(x,y)$ in the image plane is given by

$$(x,y) = \left(\frac{FX}{Z},\ \frac{FY}{Z}\right) \qquad (i)$$

where $F$ is the focal length of the imaging system.

The surface is represented in the world frame by the functional form $Z(X,Y)$. It is
assumed that the surface can also be represented (at least locally) by the function
$z(x,y)$ in image coordinates. Here the relation between the surface normals
$\left(\frac{\partial Z}{\partial X}, \frac{\partial Z}{\partial Y}\right)$ corresponding to an image point $(x,y)$ and the partial derivatives of
$z(x,y)$ is sought.


A.  Relationship  between  surface  gradients  in  image  and  world  coordinates.  Now  a 
small  displacement  (8x.8y)  in  the  image  plane  corresponds  to  a  displacement 
(SAf.sy.SZ)  in  the  world  frame,  along  the  surface  Z(X.Y).  From  equation  (i)  we 
get  the  relation 


8x7.  +  x 87 


8\7  +  \87 


(ii.b) 


Furthermore  the  following  identity  holds 

7(X  +  8X  .  >'  +  8  Y)  -  z(x  +  8x  .  y +  8y) 
Using  the  Taylor  series  expansion  of  the  above 


Z(X  +  8X  .  Y  +  8Y)=  7(X.Y)+  +  +  {higher  order  terms)  (iv.a) 

:(x  +  8x  .  y  +  0.v)  =  7(x.y)  +  8a-|^-  +  8vyy-  +  ( higher  order  terms )  ( i V .b ) 

Neglecting  the  higher  order  terms  in  equation  (iv)  and  substituting  for  8X  and  8  Y 
from  equation  (ii)  in  equation  (iv.a) 

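$$Z(X + \delta X, Y + \delta Y) - Z(X, Y) = \delta Z = \frac{1}{F}(\delta x\, Z + x\, \delta Z)\frac{\partial Z}{\partial X} + \frac{1}{F}(\delta y\, Z + y\, \delta Z)\frac{\partial Z}{\partial Y}$$

Collecting the terms in $\delta Z$,

$$\delta Z = \frac{Z\, \delta x \frac{\partial Z}{\partial X} + Z\, \delta y \frac{\partial Z}{\partial Y}}{F - x \frac{\partial Z}{\partial X} - y \frac{\partial Z}{\partial Y}} \qquad \text{(v)}$$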


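Recall now that

$$Z(X + \delta X, Y + \delta Y) - Z(X, Y) = z(x + \delta x, y + \delta y) - z(x, y)$$

Therefore, combining equations (iii), (iv) and (v),

$$\frac{Z\, \delta x \frac{\partial Z}{\partial X} + Z\, \delta y \frac{\partial Z}{\partial Y}}{F - x \frac{\partial Z}{\partial X} - y \frac{\partial Z}{\partial Y}} = \delta x \frac{\partial z}{\partial x} + \delta y \frac{\partial z}{\partial y} \qquad \text{(vi)}$$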

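Since $\delta x$ and $\delta y$ are independent of each other we have

$$\frac{\partial z}{\partial x} = \frac{Z \frac{\partial Z}{\partial X}}{F - x \frac{\partial Z}{\partial X} - y \frac{\partial Z}{\partial Y}} \qquad \text{(vii.a)}$$

$$\frac{\partial z}{\partial y} = \frac{Z \frac{\partial Z}{\partial Y}}{F - x \frac{\partial Z}{\partial X} - y \frac{\partial Z}{\partial Y}} \qquad \text{(vii.b)}$$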
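As a sanity check on the relation between the image-coordinate gradient and the world gradient, the sketch below compares a finite-difference gradient of z(x, y) with the value Z p / (F - x p - y q) for a planar surface Z = aX + bY + c. The plane, its coefficients, the focal length and the helper names are assumptions introduced only for this illustration. For a plane, substituting X = xZ/F and Y = yZ/F gives z(x, y) = cF / (F - ax - by) in closed form.

```python
def plane_depth(x, y, a, b, c, F):
    """Depth z(x, y) of the plane Z = a*X + b*Y + c seen at image point (x, y)."""
    return c * F / (F - a * x - b * y)


def image_gradient(Z, p, q, x, y, F):
    """Image-coordinate depth gradient predicted from the world gradient (p, q)."""
    denom = F - x * p - y * q
    return Z * p / denom, Z * q / denom


# Hypothetical plane and camera parameters.
a, b, c, F = 0.3, -0.2, 5.0, 1.5
x, y, h = 0.4, 0.1, 1e-6

Z = plane_depth(x, y, a, b, c, F)
zx_pred, zy_pred = image_gradient(Z, a, b, x, y, F)

# Central finite differences of z(x, y) in image coordinates.
zx_num = (plane_depth(x + h, y, a, b, c, F) - plane_depth(x - h, y, a, b, c, F)) / (2 * h)
zy_num = (plane_depth(x, y + h, a, b, c, F) - plane_depth(x, y - h, a, b, c, F)) / (2 * h)

print(zx_pred, zx_num)
print(zy_pred, zy_num)
```

The two pairs of values agree to within the finite-difference error, as expected when the world gradient of the plane is (p, q) = (a, b).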


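B. What Shape means. Consider the shape information available from the field of surface normals indexed by the image coordinates. Making the appropriate substitutions from equations (vii) in equation (iv.b), and dividing by $z(x, y) = Z$, we have:

$$\frac{z(x + \delta x, y + \delta y)}{z(x, y)} = 1 + \frac{\delta x \frac{\partial Z}{\partial X} + \delta y \frac{\partial Z}{\partial Y}}{F - x \frac{\partial Z}{\partial X} - y \frac{\partial Z}{\partial Y}}$$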

Thus  the  following  statement  can  be  made: 


Under perspective projection, when the field of surface normals is available, indexed by image coordinates, then the image centered depth function can be computed up to a dilation factor.
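The dilation ambiguity can be illustrated numerically. Dividing the image gradient by the depth gives d(ln z)/dx = p / (F - x p - y q), so integrating the normal field along an image row yields only a ratio of depths. The sketch below is an assumed example rather than the author's algorithm: it uses a sphere, a unit focal length, a trapezoid rule, and helper names invented here, and it shows that a sphere and its tenfold dilation produce the same normal field in image coordinates and hence the same depth ratio.

```python
import math


def sphere_depth(x, y, d, r):
    """Near-side depth z(x, y) of a sphere of radius r centred at (0, 0, d), F = 1."""
    s = x * x + y * y + 1.0
    return (d - math.sqrt(d * d - s * (d * d - r * r))) / s


def sphere_gradient(x, y, d, r):
    """World gradient (p, q) = (dZ/dX, dZ/dY) at the surface point imaged at (x, y)."""
    z = sphere_depth(x, y, d, r)
    # On the sphere dZ/dX = -X / (Z - d), with X = x z and Y = y z.
    return -x * z / (z - d), -y * z / (z - d)


def depth_ratio(gradient, x0, x1, y, n=2000):
    """z(x1, y) / z(x0, y) by trapezoid integration of p / (1 - x p - y q)."""
    def integrand(x):
        p, q = gradient(x, y)
        return p / (1.0 - x * p - y * q)

    h = (x1 - x0) / n
    total = 0.5 * (integrand(x0) + integrand(x1))
    total += sum(integrand(x0 + i * h) for i in range(1, n))
    return math.exp(h * total)


def small(x, y):
    return sphere_gradient(x, y, 5.0, 1.0)


def large(x, y):
    return sphere_gradient(x, y, 50.0, 10.0)  # the same scene dilated by 10


ratio_small = depth_ratio(small, 0.0, 0.1, 0.05)
ratio_large = depth_ratio(large, 0.0, 0.1, 0.05)
true_ratio = sphere_depth(0.1, 0.05, 5.0, 1.0) / sphere_depth(0.0, 0.05, 5.0, 1.0)
print(ratio_small, ratio_large, true_ratio)
```

Both scenes give the same ratio, so the normals fix the depth map only up to a common scale factor.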


Lemma I. If the surface $Z$ is represented by an algebraic function $Z(X, Y)$, and furthermore if the function $z(x, y)$ denotes the same surface in terms of the image coordinates $(x, y)$, then the tilt function $\tau(x, y)$ is given by

$$\tau = \frac{\partial Z / \partial Y}{\partial Z / \partial X} = \frac{\partial z / \partial y}{\partial z / \partial x}$$

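Proof: Since $Z(X, Y)$ is an algebraic function, by definition it can be expressed implicitly by the polynomial equation $F(X, Y, Z) = 0$. We can write $F()$ as

$$\sum_{i=0}^{L} \sum_{j=0}^{M} \sum_{k=0}^{N} c_{ijk} X^i Y^j Z^k = 0 \qquad \text{(viii)}$$

where the $c_{ijk}$'s are real constants and $L$, $M$, $N$ are finite positive integers. By using the implicit function theorem we get

$$\tau = \frac{\partial Z / \partial Y}{\partial Z / \partial X} = \frac{F_Y}{F_X}$$

where $F_X$, $F_Y$, $F_Z$ denote the partial derivatives of $F()$ with respect to $X$, $Y$ and $Z$. Therefore we have from equation (viii):

$$\tau = \frac{\sum_{i=0}^{L} \sum_{j=1}^{M} \sum_{k=0}^{N} j\, c_{ijk} X^i Y^{j-1} Z^k}{\sum_{i=1}^{L} \sum_{j=0}^{M} \sum_{k=0}^{N} i\, c_{ijk} X^{i-1} Y^j Z^k} \qquad \text{(ix)}$$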


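Observe now that we can obtain an implicit representation for the depth in terms of the image coordinates $(x, y)$ from equation (viii) by substituting for $X$ and $Y$ in accordance with $x = X/Z$ and $y = Y/Z$ (where the focal length is assumed to be 1). Thus we obtain the representation $G(x, y, z) = 0$ or

$$\sum_{i=0}^{L} \sum_{j=0}^{M} \sum_{k=0}^{N} c_{ijk} x^i y^j z^{i+j+k} = 0$$

Again by the implicit function theorem we have

$$\frac{\partial z / \partial y}{\partial z / \partial x} = \frac{G_y}{G_x} = \frac{\sum_{i=0}^{L} \sum_{j=1}^{M} \sum_{k=0}^{N} j\, c_{ijk} x^i y^{j-1} z^{i+j+k}}{\sum_{i=1}^{L} \sum_{j=0}^{M} \sum_{k=0}^{N} i\, c_{ijk} x^{i-1} y^j z^{i+j+k}} \qquad \text{(x)}$$

or

$$\frac{\partial z / \partial y}{\partial z / \partial x} = \frac{\sum_{i=0}^{L} \sum_{j=1}^{M} \sum_{k=0}^{N} j\, c_{ijk} x^i y^{j-1} z^{i+j+k-1}}{\sum_{i=1}^{L} \sum_{j=0}^{M} \sum_{k=0}^{N} i\, c_{ijk} x^{i-1} y^j z^{i+j+k-1}} \qquad \text{(xi)}$$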

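Consider now equation (ix) and substitute $X = xz$ and $Y = yz$ (with $Z = z$):

$$\frac{\partial Z / \partial Y}{\partial Z / \partial X} = \frac{\sum_{i=0}^{L} \sum_{j=1}^{M} \sum_{k=0}^{N} j\, c_{ijk} x^i y^{j-1} z^{i+j+k-1}}{\sum_{i=1}^{L} \sum_{j=0}^{M} \sum_{k=0}^{N} i\, c_{ijk} x^{i-1} y^j z^{i+j+k-1}} \qquad \text{(xii)}$$

But the right hand sides of the equations (xi) and (xii) are identical. This means

$$\frac{\partial Z / \partial Y}{\partial Z / \partial X} = \frac{\partial z / \partial y}{\partial z / \partial x}$$

which concludes the proof of the lemma.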
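The lemma can be spot-checked on a concrete algebraic surface. The sketch below assumes the surface Z = X^2 + 2Y^2 + d with unit focal length; this surface, the sample point and the function names are chosen here for illustration and do not come from the text. The world-coordinate ratio (dZ/dY)/(dZ/dX) is evaluated analytically and compared with the same ratio formed from finite differences of z(x, y).

```python
import math


def image_depth(x, y, d):
    """Depth z(x, y) of the surface Z = X**2 + 2*Y**2 + d viewed with F = 1."""
    w = x * x + 2.0 * y * y
    # Substituting X = x z, Y = y z gives w z**2 - z + d = 0; take the root
    # that tends to d as (x, y) -> (0, 0). Not valid at w == 0.
    return (1.0 - math.sqrt(1.0 - 4.0 * d * w)) / (2.0 * w)


def world_tilt(x, y, d):
    """(dZ/dY) / (dZ/dX) at the surface point imaged at (x, y)."""
    z = image_depth(x, y, d)
    X, Y = x * z, y * z
    return (4.0 * Y) / (2.0 * X)


def image_tilt(x, y, d, h=1e-6):
    """(dz/dy) / (dz/dx) from central differences in image coordinates."""
    zx = (image_depth(x + h, y, d) - image_depth(x - h, y, d)) / (2 * h)
    zy = (image_depth(x, y + h, d) - image_depth(x, y - h, d)) / (2 * h)
    return zy / zx


# Hypothetical sample point and surface offset.
x, y, d = 0.1, 0.08, 3.0
tau_world = world_tilt(x, y, d)
tau_image = image_tilt(x, y, d)
print(tau_world, tau_image)
```

The two ratios agree to within the finite-difference error, consistent with the statement that the tilt can be read from either representation.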


Shape  under  Orthographic  Projection: 

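Under orthography the image coordinates of a point are equal to the corresponding three dimensional coordinates, or

$$(x, y) = (X, Y)$$

Thus

$$\frac{\partial z}{\partial x} = \frac{\partial Z}{\partial X}, \qquad \frac{\partial z}{\partial y} = \frac{\partial Z}{\partial Y}$$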


Now observe from equation (iv.a) that when the surface normals are known at an image point $(x, y)$, then the depth difference between this point and neighbouring image points is known:

$$Z(X + \delta X, Y + \delta Y) - Z(X, Y) = \delta X \frac{\partial Z}{\partial X} + \delta Y \frac{\partial Z}{\partial Y} + (\text{higher order terms})$$

Thus we can state the following:

When a map of surface normals is available under orthography, the depth function can be computed up to a constant additive term.
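A minimal sketch of this statement follows, assuming an illustrative surface Z = sin(X) cos(Y) + offset and a simple midpoint-rule path integration; neither is taken from the text. Integrating the gradient field recovers depth differences only, so two surfaces that differ by a constant offset yield the same result.

```python
import math


def integrate_depth(p, q, X, Y, n=1000):
    """Z(X, Y) - Z(0, 0) by integrating the gradient along (0,0) -> (X,0) -> (X,Y)."""
    hx, hy = X / n, Y / n
    # Midpoint rule on each leg of the L-shaped path.
    leg_x = sum(p((i + 0.5) * hx, 0.0) for i in range(n)) * hx
    leg_y = sum(q(X, (j + 0.5) * hy) for j in range(n)) * hy
    return leg_x + leg_y


def surface(X, Y, offset):
    """Hypothetical test surface; the offset is the unrecoverable additive term."""
    return math.sin(X) * math.cos(Y) + offset


def p(X, Y):
    return math.cos(X) * math.cos(Y)   # dZ/dX, independent of the offset


def q(X, Y):
    return -math.sin(X) * math.sin(Y)  # dZ/dY, independent of the offset


X, Y = 1.2, 0.7
recovered = integrate_depth(p, q, X, Y)
diff_offset_0 = surface(X, Y, 0.0) - surface(0.0, 0.0, 0.0)
diff_offset_7 = surface(X, Y, 7.0) - surface(0.0, 0.0, 7.0)
print(recovered, diff_offset_0, diff_offset_7)
```

The recovered value matches the depth difference for either offset, which is why the absolute depth remains undetermined by the normals alone.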


Figure I. Representation of Rigid Motion. P, Q and R are three points on the rigid body. XYZ is the reference frame. The body centered frame is at R. The motion of R is given by the translational velocity T = (U, V, W) and the rotational velocity Ω = (α, β, γ). The velocity of P is (T + Ω × ρ). The representation chosen assumes the body origin to coincide with the origin of the reference frame. Thus R is a logical extension of the body.


Figure II a. The Perspective Projection Model. The image p = (x, y) of the world point P = (X, Y, Z) is projected by the ray OP. The focal length of the system is F. The equation of the image plane is Z = F. The relation between image and world coordinates is x = FX/Z and y = FY/Z.


Figure IV. Cooperative algorithm for the computation of rigid motion parameters from optical flow and shape information. (Algorithm VII)


